Preface to the 
Second Edition 


This treatment is an updated and expanded version of a book that was first 
published in 1994. Much has transpired in the intervening decade: new lab- 
oratory methods for uncovering molecular markers have been introduced 
and refined, statistical and conceptual approaches for estimating intraspecif- 
ic genealogy and interspecific phylogeny have been improved, and a vast 
armada of empirical examples has been added to a burgeoning scientific
literature. In some topical areas (e.g., fossil DNA and horizontal genetic
transmission), earlier scientific thought has been completely overturned by
molecular findings over the past 10 years, and knowledge on numerous other
topics (e.g., vertebrate mating systems, ecological speciation, and life's deep
phylogeny) has expanded greatly. On the other hand, the major types of 
questions tackled by molecular ecologists, behaviorists, and evolutionists 
remain much the same. Researchers still employ molecular markers to
estimate and interpret evolutionary relationships of organisms along a temporal
continuum ranging from clonality, genetic parentage and genealogy in the 
most recent generations, to phylogenetic affinities in ancient branches of the 
Tree of Life. This revised edition will further document how molecular mark- 
ers reveal otherwise hidden aspects of behavior, natural history, ecology, and 
the evolutionary histories of plants, animals, and microbes in the wild. 

Why is a treatment of this topic necessary when several excellent texts 
in molecular ecology or evolution already exist? Most of these books have 
focused on: proteins and DNA as primary objects of interest in their own 
right (e.g., Graur and Li 2000; Li 1997; Li and Graur 1991); broad conceptu- 
al issues regarding patterns, processes, or mechanisms of molecular evolu- 
tion (Ayala 1976a; Nei and Koehn 1983; Selander et al. 1991b; Takahata and 
Clark 1993); statistical or mathematical aspects of population-genetic or 
phylogenetic theory (Nei and Kumar 2000; Page and Holmes 1998); or 
detailed methodological procedures of data acquisition and analysis (Baker
2000; Ferraris and Palumbi 1996; Hillis et al. 1996; Karp et al. 1998). Some
textbooks and edited volumes have approached more closely what is 
attempted here (Baker 2000; Caetano-Anollés and Gresshoff 1997; Carvalho 
1998; Hoelzel 1992; Hoelzel and Dover 1991a), but most of them are either
popularized accounts (Avise 2001a, 2002) or else are restricted to research 
topic, laboratory method, or taxonomic group (Avise 2000a; Hollingsworth 
et al. 1999; Kocher and Stepien 1997; Mindell 1997; Phillips and Vasil 2001; 
Sibley and Ahlquist 1990; Soltis et al. 1992). No other classroom textbook or
reference work quite fills the niche toward which this book is aimed: the 
wide world of biological applications for molecular genetic markers in the 
contexts of ecology, behavior, natural history, evolution, and organismal 
phylogeny. 

The first edition of Molecular Markers included references to about 2,200 
studies from the then-neophyte fields of molecular ecology and evolution, 
and this second edition approximately doubles that total count of citations 
from the primary literature. Thus, this compendium is again intended to 
provide a thorough introduction to relevant research that can serve both as 
an educational tool and stimulus for students, and an extensive reference 
guide for practicing researchers. Despite this coverage, an encyclopedic 
treatment of all relevant studies is no longer feasible because of the explo- 
sive growth of molecular ecology and evolution since the early 1990s. Thus, 
by necessity I have been selective in the choice of additional examples to 
illustrate various topics. I also retained many of the citations and examples 
(albeit updated) from the first edition, in part to provide historical perspec- 
tive (research approaches in molecular ecology and evolution have them- 
selves evolved), and in part to give due credit to pioneering works that 
should not be forgotten. Indeed, an important goal of this book is to describe 
not only the current state of biological knowledge derived from molecular 
markers, but also to trace how that current state of affairs has come to be. 

Like its predecessor, this second edition is not intended to be a detailed 
“how to" book on laboratory details and analytical methods of molecular 
ecology and evolution (although sufficient background is provided for 
beginners). Rather, this book is more of a “what-has-been-and-can-be-done” 
treatment intended to stimulate ideas and pique the research curiosity of 
young biology students and seasoned professionals alike. I hope this
renovated edition will be read and enjoyed in this imaginative spirit of
scientific adventure.


Preface to the
First Edition*


I never cease to marvel that the DNA and protein markers magically emerg- 
ing from molecular-genetic analyses in the laboratory can reveal so many 
otherwise hidden facets about the world of nature. Can individual plants 
sometimes exist as genetic mosaics derived from multiple zygotes? Is repro- 
duction by unicellular organisms predominantly sexual or clonal? What is 
the typical evolutionary lifespan of parthenogenetic all-female lineages, 
given that they lack recombinational genetic variation that otherwise might 
enable them to respond to changing environments? What is the genetic 
makeup of social groups within various species of insects, fishes, mammals, 
and other organisms whose behaviors might have evolved under the influ- 
ence of kin selection? In birds and other taxa, how often does intraspecific 
brood parasitism occur, wherein females surreptitiously "dump" eggs into 
the nests of soon-to-be foster parents? Do migratory marine turtles return to 
their natal sites for nesting? How often has carnivory evolved among plants? 
What are the evolutionary origins of cytoplasmic genomes within eukaryot- 
ic cells? How old are the fossils from which DNA can be extracted? How and 
how often have horizontal gene transfers taken place between distant forms 
of life? Have demographic bottlenecks diminished genetic variability to the 
extent that some populations can no longer adapt to environmental chal- 
lenges? How useful is the criterion of phylogenetic distinctiveness as a guide 
to prioritizing taxa and regional biotas for conservation efforts? These are but 
a small sample of the diverse problems addressed and answered (at least 
provisionally) through the use of molecular genetic markers. 

This treatment of molecular natural history and evolution is written at
a level appropriate for advanced undergraduates and graduate students, or
for professional ecologists, geneticists, ethologists, molecular biologists,
population biologists, conservationists, and others who may wish a readable
introduction or refresher to the burgeoning application of molecular
markers in their disciplines. I hope to have captured and conveyed the
genuine sense of excitement that can be brought to such fields when molecular
genetic markers with known patterns of inheritance are applied to questions
about nature and evolution. I also hope to have provided a wellspring of
research ideas for people entering the field. My goal is to present material in
a manner that is technically straightforward, without sacrificing the richness
of underlying concepts and biological applications. For the reader, the only
necessary prerequisites are an introductory knowledge of genetics and an
acute interest in the natural biological world.

*Reprinted with slight modifications from the First Edition (1994).

The fields of molecular ecology and evolution are at a stage where 
reflection on the past half-century may provide useful historical perspec- 
tive, as well as a springboard to the future. The mid-1960s witnessed the first 
explosion of interest in molecular techniques with the seminal introduction 
of protein-electrophoretic approaches to population genetics and evolution- 
ary biology. In the late 1970s, attention shifted to methods of DNA analysis 
primarily through restriction enzymes, and in the 1980s, mitochondrial 
DNA assays as well as various nuclear-DNA fingerprinting approaches 
gained great popularity. Beginning in the late 1980s, the introduction of 
PCR-mediated DNA sequencing helped to provide the first ready access to 
the "ultimate" genetic data: nucleotide sequences themselves. Nonetheless, 
it would be naive to suppose that direct DNA sequence information invari- 
ably provides the preferred or most accessible pool of genetic markers for all 
biological applications. Because of ease, cost, the amount or nature of genet- 
ic information accessed, or simplicity of data interpretation, several alterna- 
tive assay methods continue to be the techniques of choice for many eco- 
logical and evolutionary questions. Biologists sometimes are unaware of the 
arguments for and against various molecular-genetic methodologies, and 
one goal of this book is to clarify these issues. 

In scientific advance, timing and context are all-important. Imagine for 
the sake of argument that DNA sequencing methods had been widely 
employed for the past half-century and then protein electrophoresis was 
introduced. No doubt a headlong rush into allozyme techniques would 
ensue, on justifiable rationales that: the methods are inexpensive and tech- 
nically simple; observable variants reflect independent Mendelian polymor- 
phisms at several loci scattered around the genome (rather than as linked 
polymorphisms in particular stretches of DNA); and amino acid replace- 
ments uncovered by protein electrophoresis (as opposed to the silent 
nucleotide changes often revealed in DNA assays) might bring molecular 
evolutionists closer to the real "stuff" of adaptive evolution. To carry the 
argument farther, suppose that laboratory molecular genetics had been con- 
ducted throughout the last century but that some brave entrepreneurial sci- 
entist then ventured outdoors and discovered organisms, complete with 
phenotypes and behaviors! At last, the interface of gene products with the
environment would have been revealed. Imagine the sense of excitement
and the research prospects. 

These fanciful scenarios emphasize a point: molecular approaches
carry immense popularity now, but nonetheless they provide just one of 
many avenues toward understanding the natural histories and evolutionary 
biologies of organisms. Studies of morphology, ecology, and behavior unde- 
niably have shaped the great majority of scientific perceptions about the nat- 
ural world. Molecular approaches are especially exciting at this time in the 
history of science because they have opened new empirical windows and 
novel insights on more traditional biological subjects. 

In this book, I have attempted to identify and highlight select case his- 
tories where molecular methods have made significant contributions to nat- 
ural history, ecology, and evolutionary biology. The treatment cannot be 
exhaustive because many thousands of studies have utilized genetic mark- 
ers. Rather, I have tried to choose classic, innovative, or otherwise interest- 
ing examples illustrative of the best that molecular methods, both old and 
new, have to offer. Overall, I have attempted to retain a balanced
taxonomic perspective that includes examples from plants, animals, and microbes,
and indeed I hope that common threads will be evident that tie together the 
similar classes of biological questions that frequently apply to such other- 
wise disparate organisms. 

This book is organized into two parts. Part I provides introductory 
material and background: the rationale for molecular approaches in natural 
history and evolution; the history of molecular phylogenetics; introductory 
outlines of various laboratory methods and the nature of genetic data that 
each molecular method provides; and brief descriptions of some
interpretive tools of the trade, including molecular clock concepts and phylogenetic
methods as applied to molecular data. 

Part II departs significantly from most other books in molecular ecolo- 
gy and evolution by emphasizing significant biological applications via a 
plethora of empirical examples. Topics are arranged along a genealogical 
continuum from micro- to macro-evolutionary: assessment of genetic
identity/non-identity and parentage; kinship and intraspecific genealogy;
speciation, hybridization, and introgression; and assessment of mid-depth and
deep phylogeny in the evolutionary Tree of Life. A concluding chapter deals
with the relevance of molecular studies to conservation biology and the
preservation of genetic diversity. 


Introduction


The stream of heredity makes phylogeny; in a sense, it is phylogeny. 
Complete genetic analysis would provide the most priceless data for the 
mapping of this stream. 

G. G. Simpson (1945) 


This book is about molecular markers and their role in genetic studies of popula- 
tion biology, natural history, and evolution. Researchers routinely utilize the 
hereditary information in biological macromolecules (proteins and nucleic acids) 
to address questions concerning organismal behavior, kinship, and phylogeny. 
When used to best effect, molecular data are integrated with information from 
ecology, observational natural history, ethology, comparative morphology, physi- 
ology, historical geology, paleontology, systematics, and other time-honored dis- 
ciplines. Each of these traditional areas of science has been enriched, if not reju- 
venated, by contact with the field of molecular genetics. 

Interest in molecular ecology and evolution can center either on particular 
genes or proteins themselves (i.e., in genetic variation per se and its functional
roles in development, physiology, and metabolism) or on the utility of molecules
as genealogical markers for analyses of natural history and phylogeny. This book 
primarily addresses the second of these arenas. However, functional and 
genealogical understandings are mutually informative. For example, knowledge 
of the precise molecular basis and mode of hereditary transmission of a genetic
polymorphism can be crucial to proper interpretation of molecular markers in a 
population context; conversely, patterns of population variation and divergence 
in molecular markers can be highly enlightening about the causal forces imping- 
ing on molecular as well as organismal evolution. 














4 Chapter 1 


With exceptional research effort, it is sometimes possible to identify and 
characterize the actual genes or chromosomal regions contributing to popu- 
lation variation in particular organismal features and adaptations. Such 
molecular-level dissections can give fresh insight into the mechanistic basis, 
as well as the evolutionary origins and maintenance, of phenotypic variety
(Jackson et al. 2002; Lynch and Walsh 1998; Purugganan and Gibson 2003). 
Another research objective, however, is to examine patterns of genetic vari- 
ation in appropriate "randomly chosen" proteins or segments of DNA. 
When these naturally occurring molecular tags are interpreted as genealog- 
ical markers, they offer extraordinary power to illuminate such topics as 
wildlife forensics, genetic parentage, reproductive modes, mating systems, 
kinship, population structure, dispersal and gene flow, intraspecific phylo- 
geography, speciation, hybridization, introgression, phylogeny, taxonomy, 
systematics, and conservation biology. 

Phylogeny is evolutionary history—that is, topology in the proverbial 
“Tree of Life." All organisms share certain features (most notably, nucleic 
acids as hereditary material) that suggest a single or monophyletic origin on 
Earth between 3 and 4 billion years ago. The proliferation of life has 
involved successive branching and occasional anastomosis (secondary join- 
ing) of hereditary lineages, with organisms alive today representing twig- 
tips in the now-outermost canopy of the phylogenetic tree. A complete 
understanding of phylogeny requires knowledge of both branching order
(cladogenetic splitting of lineages) and branch lengths (anagenetic changes 
within lineages through time). Occasional instances of lateral DNA transfer
between branches (reticulate evolution), mediated by such evolutionary
processes as interspecific hybridization, establishment of endosymbiotic 
associations among genomes, or other means of horizontal gene flow must 
also be considered (see Chapters 7 and 8). 

Nearly all studies that utilize molecular markers can be viewed as 
attempts to estimate genetic relationships somewhere along a hierarchical 
continuum of evolutionary divergences ranging from recent to distant
(Figure 1.1): genetic identity versus non-identity (as in distinguishing
clonemates from non-clonemates in species that can reproduce both asexually
and sexually), genetic parentage (biological maternity and paternity), 
extended kinship within the pedigree of a local deme, genealogical affinities 
among geographic populations of a species, genetic divergence among 
species that separated recently, to phylogenetic connections at intermediate 
and ancient branches in the Tree of Life. Different types of molecular assays 
provide genetic information ideally suited to different temporal horizons in 
this hierarchy, and a continuing challenge is to develop and utilize methods 
appropriate for each particular biological question (Parker et al. 1998). 

It is also befitting to orient this book around genealogy because of the
central importance of historical factors and nonequilibrium outcomes in
ecology and evolution. As noted by Hillis and Bull (1991), "Virtually all
comparative studies of biological variation among species depend on a
phylogenetic framework for interpretation." If Dobzhansky's (1973) famous
dictum that "nothing in biology makes sense except in the light of evolution"
is correct, then it might be appended that "much in evolution makes even
more sense in the light of historical genealogy." Brooks and McLennan
(1991, 2002) have repeatedly emphasized the need for phylogenetic analyses
in ecology and ethology, as have many others for more than two decades
(e.g., Eldredge and Cracraft 1980; Harvey and Pagel 1991). Such calls have
increasingly been heeded, and molecular phylogenetic analyses are now an
integral part of modern appraisals of population genealogy (Avise 2000a),
speciation (Barraclough and Vogler 2000), and interspecific evolution
(Farrell 1998; Lutzoni and Pagel 1997). With genealogical relationships of
individuals and species properly sorted out via molecular markers, the
phylogenetic origins and histories of all other organismal traits, as well as the
ecological and evolutionary processes that have forged them, usually
become much clearer.

[Figure 1.1: horizontal axis running from macro-scale to micro-scale; surviving label: Phylogeny.]

Figure 1.1 The hierarchical nature of phylogenetic assessment. (After Avise et al.
1987a.)


Why Employ Molecular Genetic Markers? 


In the pre-molecular era, standard approaches to estimating kinship and 
phylogeny necessarily entailed comparisons of phenotypic data from mor- 
phology, physiology, behavior, or other organismal characteristics amenable 
to observation. Molecular ecologists and evolutionists also employ the com- 
parative method, but the comparisons now include direct or indirect geno- 
typic information from nucleic acid and protein sequences. Why are such 
molecular features of special significance?


Molecular data are genetic 


Molecular data provide genetic information. This simple truism is of over- 
riding significance. Because phylogeny is “the stream of heredity,” only 
genetic traits are genealogically informative. Molecular assays reveal not 
only detailed features of DNA (or, sometimes, their protein products), but 
also variable character states whose particular genetic bases and modes of 
transmission can be precisely specified. Thus, from explicit knowledge of 
the amount and nature of genetic information assayed, statements of rela- 
tive confidence can be placed on molecular-based genealogical conclusions. 

This situation contrasts with the insecurity of knowledge concerning 
the precise genetic bases of conventional characters used to address organ- 
ismal relationships (Barlow 1961; Boag 1987). Seldom can scientists specify 
particular genes or alleles that govern the morphological, physiological, or
behavioral traits traditionally surveyed in phylogenetic assessments. 
Indeed, some such taxonomic traits have been shown to be affected by 
environmental as well as hereditary factors. In plants, phenotypic or devel- 
opmental plasticity (wherein individuals can assume different forms dur- 
ing ontogeny in response to varying environmental milieus, ranging from 
intracellular to ecological) has long been recognized as a potent source of 
phenotypic variation (Clausen et al. 1940). The phenomenon is pervasive in 
the animal world as well (see review in West-Eberhard 2003), involving fea- 
tures ranging from leg forms in barnacles (Marchinko 2003) to gill-raker 
numbers in fish (Loy et al. 1999) to phenotypic components of mate choice 
in moths and birds (Ohlsson et al. 2002; Rodriguez and Greenfield 2003). 
For example, a significant fraction of the variance in morphometric features 
among taxonomic subspecies of the red-winged blackbird (Agelaius 
phoeniceus) proved to be due to nestling rearing conditions, as became evi- 
dent when progeny hatched from experimentally transplanted eggs con- 
verged on some of the morphological traits of their foster parents (James 
1983). Similarly, cross-fostered great tits (Parus major) were shown to have 
partially converged on the carotenoid-based plumage colorations of their 
foster fathers (Fitze et al. 2003). Although certainly important in ecology 
and evolution, such phenotypic variation per se can be misleading if inter- 
preted as providing genetic characters of immediate utility in kinship 
assessment or phylogeny estimation. 


Molecular methods open the entire biological world 
for genetic scrutiny 


Prior to the introduction of molecular approaches, most genetic studies 
were confined to a small handful of species that could be reared and crossed 
in the laboratory or garden: bacteria such as Escherichia coli and their phages,
Mendel’s pea plants (Pisum sativum), corn (Zea mays), fruit flies (genus 
Drosophila), and house mice (Mus musculus). From hereditary patterns across 
generations, the genetic bases of particular morphological or physiological 
traits in these species were deduced. However, such analyses could hardly
be expected to capture the full richness of diversity among the multitudi- 
nous genes within these study organisms, much less to embrace the broad- 
er flavor of genetic diversity across the Earth's other biota. In contrast, 
molecular assays can provide direct physical evidence on essentially any 
DNA sequence or protein, and they can be applied to the genetics of any and
all creatures, from microbes to whales. 


Molecular methods access a nearly unlimited 
pool of genetic variability 


The information content of a genome is enormous. For example, a typical 
mammalian genome consists of some 3 billion nucleotide pairs in a com- 
posite sequence roughly 100 times longer than the total string of letters in a 
20-volume encyclopedia. Each genome truly is an encyclopedic repository 
of information, not only encoding the ribonucleic acids and proteins that are 
the working machinery of cellular life, but also retaining within its 
nucleotide sequence a detailed historical record of phylogenetic links to 
other forms of life. The genomes of various bacterial species range in size 
from about half a million to more than 10 million base pairs (bp); those of 
unicellular protists range from 20 million to more than 500 billion bp, and 
those of multicellular fungi, plants, and animals range from about 10 million 
to more than 200 billion bp (Cavalier-Smith 1985; Graur and Li 2000; 
Sparrow et al. 1972). Molecular assays employed in ecology and evolution 
involve sampling, often more or less at random, dozens to thousands of 
genetic markers from such vast hereditary archives. 

The levels of genetic variation within most species also are incredible. 
Consider, for example, the 3 billion bp human genome, the first full exem- 
plar of which was draft-sequenced in 2001 (Lander et al. 2001; Venter et al. 
2001). From this and other less exhaustive molecular appraisals, it appears 
that randomly drawn pairs of homologous DNA sequences from the human 
gene pool typically differ at about 0.1% of nucleotide positions (Chakravarti 
1998; Stephens et al. 2001b). Thus, if a second human genome were to be 
sequenced fully, it would probably differ from the first at roughly 3 million 
nucleotide sites. Furthermore, the magnitude of nucleotide diversity in 
humans is near the lower end of the scale, compared with that reported in 
many animal and plant species (Li and Sadler 1991). Indeed, because of the
recent “peopling of the planet” and the lack of long-term population struc- 
ture in extant humans (see Chapter 6), total genetic diversity within Homo 
sapiens is even lower than that within our closest primate relatives, chim- 
panzees and gorillas (Ruvolo et al. 1993, 1994). 
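
The estimate of roughly 3 million differing sites follows directly from the two figures quoted above (a genome of about 3 billion bp and a typical pairwise difference of about 0.1%). The short Python sketch below simply repeats that multiplication; the function name `expected_differences` is hypothetical and is introduced here only for illustration.

```python
def expected_differences(genome_size_bp: int, per_site_difference: float) -> float:
    """Expected number of differing sites between two homologous sequences.

    Assumes the per-site difference applies uniformly across the genome,
    which is a simplification made only for this illustration.
    """
    return genome_size_bp * per_site_difference


HUMAN_GENOME_BP = 3_000_000_000   # about 3 billion bp, as quoted in the text
PAIRWISE_DIFFERENCE = 0.001       # about 0.1% of nucleotide positions

diffs = expected_differences(HUMAN_GENOME_BP, PAIRWISE_DIFFERENCE)
print(f"{diffs:,.0f} expected differing sites")
```

The same function can be reused with other genome sizes or diversity values to see how quickly the count of variable sites grows in species more diverse than humans.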

Complete DNA sequences are now available for numerous model 
species, including more than 100 prokaryotic microbes and a growing list of 
multicellular organisms of special interest in medicine, epidemiology, and 
comparative genomics (Hedges and Kumar 2002). However, for most appli- 
cations in population and evolutionary biology, full genomic sequencing is 
unnecessary because, with far less intensive laboratory effort, molecular
markers can be obtained that display ample variability for even the most 
refined forensic diagnoses and phylogenetic appraisals. 

Nearly 100 protein and blood group polymorphisms already had been 
surveyed among the major human races more than a decade ago (Nei and 
Livshits 1990), and in excess of 2,000 DNA polymorphisms in the human
gene pool had been uncovered by restriction enzyme analyses by that time 
(Stephens et al. 1990a; Weissenbach et al. 1992). The numbers of available 
molecular markers have increased dramatically since then (Boyce and 
Mascie-Taylor 1996; Cavalli-Sforza 2000; Donnelly and Tavaré 1997). For 
example, a recent analysis examined the statistical distributions of 1.4 million 
SNPs (single nucleotide polymorphisms) in sample databases from the 
human genome (Kendal 2003). For the sake of extremely conservative argu- 
ment, suppose that just 30 marker polymorphisms in humans were available 
for examination, each with the minimum possible two alleles (many loci in 
fact have multiple alleles), Rules of Mendelian heredity show that the theo- 
retical number óf different human genotypes that could arise from genetic 
recombination would then be 3%, or about 200 trillion. Approximately 6 bil- 
lion people are alive today, and roughly 13 billion people have inhabited the 
planet since the origin of Homo sapiens. Thus, even with this unrealistically 
small number of genetic polymorphisms, the potential number of distinct 
human genotypes would vastly exceed the number of individuals who have 
ever lived, and no two people (barring monozygotic twins) in the past, pres- 
ent, or foreseeable future would likely be identical at all loci. In human foren- 
sic practice, standardized assays of even modest numbers of highly allelic 
Mendelian polymorphisms (typically from microsatellite loci) provide
“DNA fingerprints" so powerful that courts of law now routinely accept the 
results as definitive genetic evidence of individual identification and biolog- 
ical parentage (see Chapter 5). Molecular polymorphisms in other species 
likewise permit endless opportunities in wildlife forensics. 
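
The genotype count in the conservative argument above can be reproduced in a few lines. A diploid locus with two alleles has three possible genotypes (two homozygotes and one heterozygote), so 30 independent loci allow 3^30 combinations. The Python sketch below is a minimal illustration that assumes the loci assort independently; the helper name `genotypes_per_locus` is hypothetical, and the population figure is the one quoted in the text.

```python
def genotypes_per_locus(n_alleles: int) -> int:
    """Number of distinct diploid genotypes at one locus with n_alleles alleles.

    Counts unordered allele pairs: n homozygotes plus n*(n-1)/2 heterozygotes.
    """
    return n_alleles * (n_alleles + 1) // 2


N_LOCI = 30                          # marker polymorphisms assumed in the text
ALLELES_PER_LOCUS = 2                # the minimum possible number of alleles
PEOPLE_EVER_LIVED = 13_000_000_000   # rough figure quoted in the text

# Independent loci multiply: (genotypes per locus) ** (number of loci).
possible_genotypes = genotypes_per_locus(ALLELES_PER_LOCUS) ** N_LOCI

print(f"possible genotypes: {possible_genotypes:,}")
print(f"genotypes per person who ever lived: {possible_genotypes / PEOPLE_EVER_LIVED:,.0f}")
```

Raising `ALLELES_PER_LOCUS` to values typical of microsatellite loci makes the count grow far faster, which is why modest panels of highly allelic markers suffice in forensic practice.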


Molecular data can distinguish homology from analogy 


A central challenge of phylogenetics is to distinguish the component of bio- 
logical similarity that is due to descent from a common ancestor (homology) 
from that due to evolutionary convergence from different ancestors
(analogy). Evolutionary classifications should be reflective of homologous traits
that genuinely register genealogical descent. However, particular morpho- 
logical, behavioral, or other phenotypic features (the conventional data of 
systematics) often evolve independently as selection-mediated responses to 
common environmental challenges. 

For example, Old World and New World vultures share several adapta- 
tions for carrion feeding (soaring food-searching behavior, featherless head, 
and powerful hooked beak) that formerly were thought to indicate that these 
birds had close evolutionary ties to each other and to other diurnal birds of 
prey (Falconiformes). However, extensive molecular data later prompted a 
competing hypothesis that New World and Old World vultures are only
distant evolutionary relatives, and that carrion feeding probably evolved more
than once, independently (Seibold and Helbig 1995; Sibley and Ahlquist 
1990; Wink 1995). Many other such cases have been unveiled by molecular 
phylogenetic appraisals. For example, several species pairs of cichlid fishes 
from Africa's Lake Malawi and Lake Tanganyika are closely similar in exter- 
nal appearance, but molecular markers proved that the resemblance in each 
case evolved in convergent fashion from separate cichlid ancestors (Kocher 
et al. 1993). Likewise, molecular data showed that multiple adaptive radia- 
tions of Anolis lizards on various Caribbean islands entailed repeated con- 
vergent evolution of particular morphological attributes (Jackman et al. 1997; 
Losos et al. 1998). 

Referring to molecular phylogenetic approaches, the late paleontologist 
Stephen J. Gould wrote, “I do not fully understand why we are not pro- 
claiming the message from the housetops ... We finally have a method that 
can sort homology from analogy." Gould (1985) was reveling in the fact that 
when species are assayed for perhaps hundreds or thousands of molecular 
characters, any widespread and intricate similarities that are present in these 
biological macromolecules are unlikely to have arisen by convergent evolu- 
tion and, thus, must reflect true phylogenetic descent. With species' phylo- 
genies properly sorted out via molecular markers, the evolutionary origins 
and histories of other organismal phenotypes usually become far more 
apparent. In other words, molecular phylogenies provide an archival road 
map of biodiversity. 

This is not to say that particular molecular characters invariably are free 
from homoplasy (convergences, parallelisms, or evolutionary reversals that 
muddy the historical record). Indeed, some molecular features, considered 
individually, may be quite prone to homoplasy due to a small number of 
interconvertible character states and a sometimes rapid rate of change 
among them. For example, one of only four different character states
(adenine, guanine, thymine, or cytosine) can occupy each nucleotide position in
DNA, and one of only 20 different character states can occupy each amino 
acid position in a protein sequence. Thus, the phylogenetic power of macro- 
molecular sequences resides not so much in specific sites or residues, but 
rather in the extraordinary amount of cumulative information provided by 
vast numbers of ordered positions in lengthy chains of molecular sequence. 
Furthermore, some types of molecular character states, such as specific 
duplications, deletions, or rearrangements of DNA, are rare or unique events 
likely to be of single (monophyletic) evolutionary origin. These too can offer 
tremendous power, even individually, as informative road signs along the 
trail of phylogenetic history. 
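
The contrast between a single site and a long sequence can be made concrete with a little arithmetic. The sketch below is only an idealization: it assumes that sites are independent and that all character states are equally frequent, neither of which is claimed in the text, and the function name is introduced here purely for illustration. Under those assumptions, the chance that two unrelated sequences agree at every one of n sites is (1/k)^n for k character states.

```python
from math import log10

def log10_chance_identity(n_states: int, n_sites: int) -> float:
    """log10 of the probability that two unrelated sequences match at all n_sites,
    assuming independent sites and equally frequent character states."""
    return n_sites * log10(1.0 / n_states)

# Four nucleotide states for DNA, twenty amino acid states for proteins.
for molecule, n_states in (("DNA", 4), ("protein", 20)):
    for n_sites in (1, 10, 100, 1000):
        exponent = log10_chance_identity(n_states, n_sites)
        print(f"{molecule:8s} {n_sites:5d} sites: chance identity ~ 10^{exponent:.1f}")
```

A match at one nucleotide site is unremarkable under this model, whereas a match across hundreds of sites is vanishingly improbable by chance, which is the sense in which phylogenetic power is cumulative.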


Molecular data provide common yardsticks for measuring divergence 


A singularly important aspect of molecular data is that they allow direct 
comparisons of relative levels of genetic differentiation among essentially
any taxa (Avise and Johns 1999; Wheelis et al. 1992). Suppose, for example,
that one wished to quantify evolutionary differentiation within a taxonom- 
ic family or genus of fishes as compared with that within a taxonomic coun- 
terpart in birds. The kinds of morphological traits traditionally employed in 
fish systematics (e.g., numbers of lateral line scales, fin rays, or gill rakers or 
the position of the swim bladder) clearly have no direct utility for compar- 
isons with avian taxonomic characters (plumage features, structure of the 
syrinx, or arrangement of toes on the feet). Thus, in traditional systematics, 
no universal criteria were available to standardize comparisons between the 
fields of ichthyology and ornithology, much less across more disparate dis- 
ciplines such as entomology and bacteriology. However, birds, fishes, 
insects, and microbes (as well as nearly all other forms of life) do share 
numerous types of molecular traits. Ribosomal RNAs (rRNAs) and transfer 
RNAs (tRNAs) are examples of molecules with widespread, if not ubiqui- 
tous, taxonomic distributions, as are various genes encoding enzymes 
involved in central metabolic and biochemical pathways for the respiration 
and synthesis of carbohydrates, fats, amino acids, and the replication and 
expression of nucleic acids. 

The general notion of universal yardsticks in comparative molecular 
evolution is introduced in Figures 1.2 and 1.3. These graphs summarize 
reported levels of genetic divergence as estimated, respectively, by elec- 
trophoresis of several proteins encoded by nuclear genes and by sequencing 
of one mitochondrial gene among recognized taxa representing five verte- 
brate classes. By these empirical molecular criteria, the assayed congeners 
and confamilial genera of birds showed less genetic divergence than did 
many other vertebrate species of identical taxonomic rank (an unanticipated 
result, given birds’ often high anatomical differentiation; Wyles et al. 1983). 
Perhaps these avian taxa separated more recently, on average, than did many 
of their non-avian taxonomic counterparts, or perhaps they evolved more 
slowly at the molecular level. Whatever the explanation, comparative molec- 
ular genetic studies of this sort can be expanded to include almost any num- 
ber of taxa and genes, and such exercises frequently raise exciting conceptu- 
al issues about evolutionary processes that would not have been evident 
from traditional phenotypic or taxonomic inspections alone. 
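
One simple measure that could serve as such a yardstick for sequence data is the proportion of aligned sites at which two sequences differ (often called the p-distance). The following is a minimal sketch, not the estimator used in the studies cited above: the function name and the two pairs of short sequences are hypothetical and serve only to show that the same calculation applies unchanged to any pair of taxa.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Hypothetical aligned fragments, invented for illustration only.
toy_pairs = {
    "taxon pair 1 (toy)": ("ATGACCAACATCCGAAAATC", "ATGACCAACATTCGAAAATC"),
    "taxon pair 2 (toy)": ("ATGGCCCCAAACATTCGAAA", "ATGGCTCCTAATATCCGTAA"),
}
distances = {name: p_distance(a, b) for name, (a, b) in toy_pairs.items()}
for name, d in distances.items():
    print(f"{name}: p-distance = {d:.2f}")
```

Because the measure depends only on the sequences, values from fishes, birds, or bacteria can be placed on one scale. Published comparisons typically also correct for multiple substitutions at the same site, which this sketch omits.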

Figure 1.2 One early exploration of a “universal genetic yardstick” for the
vertebrates. Plotted on a common scale are means and ranges of genetic distance (codon
substitutions per locus, as estimated from multi-locus protein electrophoretic data)
among congeneric species (in parentheses are numbers of pairwise species
comparisons) within each of five vertebrate classes. Note the relatively small genetic
distances in assayed bird genera compared with those of many other taxa. (After
Avise and Aquadro 1982.)

Figure 1.3 Another potential genetic yardstick for diverse taxa. Shown on a
common scale are genetic distances (in this case, sequence divergence in the
mitochondrial cytochrome b gene) observed among congeneric and confamilial species
within each of five vertebrate classes. Each data point in a histogram represents the
average genetic distance among species within a genus or family. Assayed bird
taxa showed significantly less genetic divergence, on average, than their non-avian
taxonomic counterparts. (After Johns and Avise 1998a.)

In an early example of this “common yardstick” perspective, King and
Wilson (1975) reviewed evidence that the assayed protein and nucleic acid
sequences of humans and chimpanzees are only about as divergent as are
those of morphologically similar species of fruit flies or rodents, but much
less divergent than those of many amphibian congeners. They speculated
that the morphological differences between humans and chimpanzees, 
which led in part to the traditional placement of these species in different 
taxonomic families (Hominidae and Pongidae), might be due to evolution- 
ary changes at a few key sites of gene regulation with major phenotypic 
effects. Years later, researchers used microarray techniques and related 
molecular assays to address one prediction of this gene regulation hypothe- 
sis: that gene expression patterns might be better predictors than most struc- 
tural genes of important differences in organismal morphology, behavior, 
and cognition (Oleksiak et al. 2002). To test one aspect of this hypothesis, 
Enard et al. (2002a) compared transcriptional levels in various tissues of 
humans, chimpanzees, and other primates, and found that species-specific 
changes in gene expression had been greatest in the human brain. Further 
suggestive evidence for the importance of gene regulation and positive nat- 
ural selection in human evolution came from recent analyses that focused 
on detailed expression profiles of particular gene regions (Enard et al. 2002b; 
Hellmann et al. 2003) and from molecular findings of extensive local repat- 
terning of hominoid chromosomes (Locke et al. 2003). 

Another possibility, however, is that the perceived morphological dis- 
tinctness of humans from chimpanzees and other primates has been exag- 
gerated by anthropocentric bias. In a fascinating classic paper titled “Frog 
perspective on the morphological difference between humans and
chimpanzees,” Cherry et al. (1978) employed the anatomical traits normally used
to discriminate among frogs (eye-nostril distance, forearm length, toe 
length, etc.) to quantify the morphological separation between humans and 
chimpanzees. By these criteria, morphological divergence between the two 
primates was large even by frog standards (whereas molecular divergence 
was not), a result interpreted as consistent with the postulate that morpho- 
logical and molecular evolution can proceed at very different rates. It is iron- 
ic, yet understandable, that this pioneering attempt to evaluate a compara- 
tive yardstick for morphological evolution came from a research laboratory 
otherwise devoted to molecular biology, where genetic comparisons across 
diverse biota tend to come more naturally.

Quite apart from helping to evaluate the probable importance of gene
regulation (as well as nonregulatory changes) in organismal evolution, the
comparative information content of molecular markers raises other
important issues for taxonomy and systematics. Using extensive DNA and protein
sequences (interpreted in conjunction with paleontological evidence), might
it soon become possible to adopt a universally standardized and quantifiable
scheme for classifying all forms of life (see Chapter 8)? If so, this would rep- 
resent a dramatic departure from traditional systematic practices, in which 
both the empirical data and their interpretation have often been quite idio- 
syncratic to each taxonomic group. This is not to imply that the overall mag- 
nitude of genetic divergence between species is necessarily the only, or even 
the best, guide to phylogenetic (i.e., cladistic) relationships within particular
taxonomic groups, but it is a potentially important means of standardizing
and quantifying inter-group comparisons in ways that simply were not pos- 
sible prior to the molecular revolution in systematics. 


Molecular approaches facilitate mechanistic appraisals of evolution 


Ever since Darwin and Mendel, assessments of gross phenotype have been 
crucial in elucidating the general nature of spontaneous mutations, natural 
selection, and other evolutionary genetic forces. Today, comparative 
genomics provides previously inaccessible information about the
fundamental mechanistic basis of evolutionary transitions among phenotypes.
For example, through DNA sequencing and other laboratory approaches, 
various morphological and physiological mutations in Drosophila fruit flies 
and many other species have been characterized at the molecular level and 
shown to be attributable to specifiable molecular events, such as point 
mutations in coding regions, mutations in flanking and non-flanking regu- 
latory domains, and insertions of transposable elements (Carroll et al. 2001; 
Gerhart and Kirschner 1997; B. Lewin 1999). Homeotic genes are another 
class of loci in which genetic alterations well characterized at the molecular 
level can be of special phenotypic importance, in this case with respect to the
evolution of fundamental body plans in metazoan animals (Box 1.1). As 
cogently stated by Lenski (1995), “Molecules are more than markers.” 

Such mechanistic appraisals fall somewhat outside the subject matter of 
this book, but a few examples nonetheless can be mentioned in which data of 
relevance to functional biology arise as a direct or indirect by-product of 
molecular genealogical analyses. For example, in quantitative genetics (the 
study of complex phenotypes), a now-popular enterprise introduced by 
Paterson et al. (1988; see also Lander and Botstein 1989; Lander and Schork 
1994) involves the use of DNA markers in conjunction with experimental 
crosses to map genomic positions of loci underlying polygenic traits (i.e.,
those influenced by multiple “quantitative trait loci” or QTLs) (Box 1.2). Also, 
increasing numbers of phenotypic attributes, especially in model species, are 
yielding to detailed molecular-level characterizations informed by phylogeny 
(e.g., Long et al. 1998; Mackay 2001; Peichel et al. 2001). For example,
genealogical reconstructions based on molecular data from the alcohol dehy- 
drogenase locus in Drosophila melanogaster revealed that a mutation conferring 
a higher capacity to utilize or detoxify environmental alcohols probably arose 
within the last million years (Aquadro et al. 1986; Stephens and Nei 1985). In 
house mice, sequence analyses of introns at t-loci on chromosome 17 indicat- 
ed that particular chromosomal inversions affecting embryonic development 
originated about 3 million years ago and accumulated recessive lethality fac- 
tors that spread globally within the last 800,000 years (Morita et al. 1992). 
Detailed molecular analysis of an esterase locus in mosquitoes indicated that 
the worldwide distribution of an insecticide resistance allele had resulted 
from the migrational spread of a single mutation, rather than independent
mutations of different resistance alleles (Raymond et al. 1991). A similar
conclusion was reached for the global spread of a methicillin resistance gene in a
pathogenic bacterium, Staphylococcus aureus (Kreiswirth et al. 1993).


BOX 1.1 Homeotic Genes in Metazoan Animals

Homeotic loci, first identified in Drosophila, are developmental genes that play a
key ontogenetic role by regulating the identity of body regions, such as particular
thoracic or abdominal segments. Their salient effect on morphology is perhaps
best registered when things go wrong: Mutations in homeotic genes sometimes
cause the developmental transformation of one body region into the likeness of
another, such as converting an antenna into a leg that protrudes from a fruit fly's
head, or converting a two-winged into a four-winged fly. Although most such
mutations are quickly eliminated by natural selection, they nonetheless evidence
the magnitude of the morphotypic influence routinely exercised by homeotic loci
during normal development.

Families of homeotic genes have proved to be widespread in metazoan
animals, and their evolutionary histories have been elucidated by comparative
molecular analyses. Best characterized is the Hox gene family, which
apparently arose early in metazoan evolution, then expanded greatly in number of loci
during the radiation of bilateral animals, and again with a probable
tetraploidization event early in vertebrate evolution. The net result of these
repeated gene duplications is the presence in various modern taxa of as many as a
dozen oft-linked genes specialized to orchestrate the development of specific
body regions.

[Figure: Hox gene arrangements, grouped as anterior, central, and posterior genes,
for Vertebrate, Cephalochordate, Echinoderm, Arthropod, Nematode, Priapulid,
Annelid, and Flatworm; the tree is annotated "Major expansion of central Hox genes."]

Number and arrangement of Hox loci in representative metazoan animals. Each rectangle
represents a Hox gene influencing anterior, central, or posterior body segments. Horizontal
lines indicate gene arrangements (when known) from mapping data. On the left is a
phylogenetic tree for these metazoans based on ribosomal DNA sequences. (After Carroll et al.
2001 and de Rosa et al. 1999.)


BOX 1.2 QTL Mapping

Quantitative traits are phenotypic features influenced by multiple loci, or
polygenes. A popular exercise in recent years is the employment of large banks of
molecular markers to identify the numbers and chromosomal locations of
quantitative trait loci (QTLs) that contribute to genetic variation in particular
phenotypic attributes. Such polygenic traits might be levels of acidity in tomatoes, rates
of senescence in fruit flies, or components of reproductive isolation between
closely related species (see Chapter 7).

The QTL mapping approach requires the availability of legions of
molecular markers, scattered throughout the genome, that have been ordered
(mapped) along the chromosomes of the species of interest. Data banks
consisting of dozens to thousands of such molecular markers are now available for
increasing numbers of model species, such as Mimulus monkeyflowers
(Bradshaw et al. 1998), Helianthus sunflowers (Rieseberg et al. 1995a,c),
Drosophila fruit flies (Macdonald and Goldstein 1999), and Homo sapiens (Weiss
and Clark 2002). A typical experimental approach is as follows: Two
pure-breeding strains or species that differ in many such molecular markers are crossed to
produce F1 progeny, and these hybrids are then backcrossed to one or the other
parental form. The idea is then to monitor whether specific molecular markers
tend to co-associate ("co-segregate") in this backcross generation with specific
phenotypes of interest that also distinguish the parental forms (Paterson 1998;
Tanksley 1993). When particular polymorphic markers of known chromosomal
location explain significant proportions of the variance among phenotypes in
these backcross progeny, the deduction is that genes contributing to those
phenotypes must be closely linked to those molecular markers. The approximate
numbers and locations of genes underlying polygenic phenotypes can then be
estimated, often with the assistance of computer programs for analyzing the
statistical associations (e.g., Basten et al. 2002). The same basic rationale can also be
used to identify QTLs by searching for statistical patterns of co-segregation
between phenotypes and molecular markers through known family pedigrees
extending across multiple generations. Some QTL analyses in the literature are
remarkably refined in their capacity to pinpoint the chromosomal locations of
loci exerting influence over particular phenotypic traits (Luo et al. 2002).

Notwithstanding the current popularity of QTL mapping, the approach
has some limitations. First, any molecular polymorphisms that tend to
co-segregate with phenotypes of interest are not necessarily mechanistically
responsible for those phenotypes. Rather, they are merely physically linked to the
responsible chromosomal regions, within which there may be hundreds of
candidate genes. Second, only polygenes with relatively major effects (i.e., that
explain perhaps 10%-20% or more of the phenotypic variance, depending on
the power of the study) can be detected by QTL mapping. Third, polygenes
contributing to a given phenotype often exert their influence differentially
depending on the particular genetic backgrounds examined (e.g., Devlin et al.
2001; Hardy et al. 2003; Muir and Howard 2001; Spencer et al. 2003). Such
epistasis (inter-locus interaction) between QTL loci and their genetic
backgrounds emphasizes the desirability of conducting QTL mapping using a
variety of different strains.
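
The co-segregation test at the heart of Box 1.2 can be illustrated with a single-marker regression. What follows is a minimal Python sketch under stated assumptions, not the procedure of any program cited in the box. The eight backcross progeny, the marker names M1 to M3, and the phenotype values are all invented for illustration. Genotypes are coded 0 (homozygous for the recurrent parent's allele) or 1 (heterozygous), and each marker is scored by the proportion of phenotypic variance it explains (the squared Pearson correlation).

```python
from statistics import mean

def variance_explained(genotypes, phenotypes):
    """R^2 from regressing phenotype on a 0/1 marker genotype (squared Pearson r)."""
    g_bar, p_bar = mean(genotypes), mean(phenotypes)
    sxy = sum((g - g_bar) * (p - p_bar) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((p - p_bar) ** 2 for p in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0  # marker or phenotype does not vary among the progeny
    return sxy ** 2 / (sxx * syy)

# Hypothetical backcross progeny: one phenotype value and three marker genotypes each.
phenotypes = [10.1, 9.8, 12.2, 12.0, 10.3, 11.9, 9.9, 12.1]
markers = {
    "M1": [0, 0, 1, 1, 0, 1, 0, 1],  # built to co-segregate with the phenotype
    "M2": [0, 1, 0, 1, 0, 1, 0, 1],
    "M3": [1, 0, 0, 1, 1, 0, 0, 1],
}

r_squared = {name: variance_explained(g, phenotypes) for name, g in markers.items()}
for name, value in sorted(r_squared.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: R^2 = {value:.2f}")
```

In this toy data set the marker that explains most of the variance would be taken as closely linked to a contributing locus. As the box cautions, that is a statement about linkage, not about which gene in the region is responsible. Real analyses also require significance tests that account for the many markers examined.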

Medicine has also benefited from genealogical insights from molecular 
analysis (see Avise 1998a; McKusick 1998; Rannala and Bertorelle 2001; 
Scriver et al. 2000). In addition to their routine use in the clinical diagnosis of 
numerous genetic disorders, molecular markers have been employed to 
assess whether specific genetic diseases (e.g., phenylketonuria in Yemenite
Jews, Huntington's chorea in Afrikaners, or fragile X syndrome) are of mono- 
phyletic or polyphyletic evolutionary origin (Avigad et al. 1990; Diamond 
and Rotter 1987; Hayden et al. 1980; Richards et al. 1992). For example, DNA- 
level markers interpreted in conjunction with historical accounts revealed 
that about 90% of cases of variegate porphyria in South Africa trace to a sin- 
gle distinctive gene mutation that arose in Cape Town in the late 1600s (Hift 
et al. 1997). Molecular analyses of the distributions of specific sets of
repetitive DNA elements likewise permitted researchers to identify the
phylogenetic roots and approximate evolutionary ages of three human genetic
diseases (involving the LPL, ApoB, and HPRT genes; Martinez et al. 2001).
Additional examples of how DNA markers can inform epidemiology and 
medicine appear in later chapters. 

In general, by mapping variable phenotypic traits of species and taxo- 
nomic groups onto phylogenies estimated from molecular markers, scien- 
tists are transforming modes of inquiry into the evolutionary origins and 
histories of numerous organismal features (see Chapter 8).


Molecular approaches are challenging and exciting 


A tremendous appeal of molecular phylogenetics is the sheer intellectual 
challenge this discipline provides. Many discoveries in molecular biology 
clearly affect the practice of genealogical assessment, and some molecular- 
level phenomena now taken for granted were undreamed of even a few 
years ago. For example, nucleotide sequences in many multi-gene families 
tend to evolve in concert within a species and thereby remain relatively 
homogeneous. This process of "concerted evolution" (Ohta 2000; Zimmer et 
al. 1980), first noted by Brown et al. (1972), is due to the homogenizing 
effects of unequal crossing over among tandem repeats and to gene conver- 
sion events even among unlinked loci (Arnheim 1983; Dover 1982; Ohta 
1980, 1984; Smithies and Powers 1986). Concerted evolution means that 
multiple copies of a gene within such families do not provide the inde- 
pendent bits of phylogenetic information formerly assumed (Ohno 1970). 
Particular rRNA gene families, for example, are employed routinely as 
informative markers of phylogeny, a task that would be far more difficult or 
impossible if each of the hundreds of tandem gene sequences within a fam- 
ily evolved independently of all others. Thus, concerted evolution makes 
genes within multi-locus families far more tractable for phylogenetic analy- 
sis than would otherwise be the case. 




Figure 1.4 Possible allelic relationships within a multi-gene family. The two
circles indicate gene duplication events from an ancestral locus, producing three
extant genes a, b, and c. The three ellipses represent allelic separations, leading to
the extant alleles a1, b1, and c1 in species 1 and to a2, b2, and c2 in species 2. Genetic
comparisons between a1 and a2, b1 and b2, or c1 and c2 are orthologous, whereas all
other comparisons (e.g., between a1 and b1, a1 and c2, or a2 and b1) are paralogous.
Orthologous similarities generally date to times near the speciation event (but see
later chapters and also the following for additional distinctions between a gene tree
and a species tree: Hey 1994; W. P. Maddison 1995; Page and Charleston 1997, 1998;
Slowinski and Page 1999). By contrast, paralogous similarities date to relevant gene
duplication events, which could vastly pre-date speciation times of the organisms
compared. However, under strong concerted evolution (see text), all or portions of
a1, b1, and c1 would appear more closely related to one another than to their
respective allelic counterparts in species 2.


However, the sequences of multi-copy genes within a species do not 
always evolve in concert subsequent to the duplications from which they 
arose. This fact makes the fundamental distinction between the concepts of 
orthology (sequence similarity tracing to a speciation event) and paralogy 
(sequence similarity tracing to a gene duplication) important (Figure 1.4). 
Indeed, when estimating phylogenetic relationships from sequence data on
multi-gene families, disentangling the orthologous from the paralogous
similarities and then drawing proper genealogical conclusions accordingly
is a challenging intellectual and empirical exercise (Cotton and Page 2002;
Page 1998, 2000).
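
The bookkeeping behind the orthology/paralogy distinction in Figure 1.4 can be sketched in a few lines of Python. This is only an illustration under a strong assumption: that the gene (a, b, or c) to which each sequence belongs is already known. In practice, establishing that assignment is exactly the hard part. The tuple representation and the function name are introduced here for convenience.

```python
from itertools import combinations

def classify(copy_x, copy_y):
    """Classify a comparison between two gene copies, each given as (gene, species)."""
    (gene_x, species_x), (gene_y, species_y) = copy_x, copy_y
    if gene_x != gene_y:
        return "paralogous"   # similarity traces to a gene duplication
    if species_x != species_y:
        return "orthologous"  # similarity traces to the speciation event
    return "same gene, same species"

# The six extant copies of Figure 1.4: genes a, b, c in species 1 and 2.
copies = [(gene, species) for gene in "abc" for species in (1, 2)]

counts = {}
for x, y in combinations(copies, 2):
    label = classify(x, y)
    counts[label] = counts.get(label, 0) + 1
    print(f"{x[0]}{x[1]} vs {y[0]}{y[1]}: {label}")
```

Of the fifteen pairwise comparisons among the six copies, only three are orthologous. That imbalance is one reason why unrecognized paralogy can easily mislead a phylogenetic analysis.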

Another example of how molecular data can offer exciting new perspec- 
tives on phylogeny relates to the introduction of mitochondrial (mt) DNA 
approaches to population genetics in the late 1970s. Prior to that time, most 
biologists viewed intraspecific evolution primarily as a process of shifting 
allele frequencies, a perspective that fit well with the traditional language and 
framework of population genetics but failed to focus adequately on the 
genealogical component of population history (Avise 1989a; Wilson et al. 
1985). By providing the first accessible data on "gene trees" at the
intraspecific level, mtDNA methods forged an empirical and conceptual bridge that
now connects the formerly separate realms of microevolutionary analysis 
(population genetics and ecology) and interspecific macroevolution (the tra- 
ditional arena of phylogenetic biology). The notion of gene trees has also 
raised intriguing conceptual challenges regarding the meaning of "organis- 
mal phylogeny,” which in a very real sense can be thought of as an emergent 
or composite property of multitudinous gene genealogies that have trickled 
through an extended sexual pedigree under the vagaries of Mendelian (and 
sometimes non-Mendelian) inheritance (see Chapters 3 through 7). 

Overall, phylogenetic studies on mtDNA have stimulated a wide variety 
of formerly unorthodox (but now mainstream) notions about evolutionary 
processes (Table 1.1). Similar claims can be made for molecular characteriza- 
tions of homeotic genes (see Box 1.1; Erwin et al. 1997; Knoll and Carroll 1999), 
transposable elements (Box 1.3; see also Chapter 8), introns (Gilbert 1978; Li 
1997), and several other molecular genetic systems, all of which are now
appreciated to play huge but formerly unimagined roles in organismal evolution.


BOX 1.3 Transposable Elements

Perhaps the most unexpected and revolutionary finding in all of molecular
evolution is that the genomes of most species are riddled with roving pieces of
DNA (Sherratt 1995), commonly known as transposable elements (TEs), mobile
elements, or "jumping genes." These elements come in two broad categories:
class I elements (retrotransposable elements, or RTEs), which transpose
proliferatively by making RNA copies of themselves and reverse-transcribing those
copies into DNA, which then inserts into other genomic locations; and class II
elements, which move by excising themselves from one genomic site and
reinserting themselves into another. Class I elements are especially abundant in
eukaryotes (organisms whose cells have distinct nuclei separated from
cytoplasm), whereas class II elements tend to be relatively more abundant in
bacteria and lower eukaryotes. Class I RTEs come in various structural families and
subfamilies. One common distinction, for example, is whether an element is
flanked by two long terminal repeat sequences (LTRs; see figure) or not (as in
LINEs, an acronym for long interspersed nuclear elements).

[Figure: an element of roughly 10 kb in which a 5' LTR and a 3' LTR flank an
interior region with structural genes.]

General structure of one type of LTR retrotransposon. Shown is a gypsy-like element from
Drosophila, in which long terminal repeats (LTRs) flank genes, in this case for capsid protein (gag),
reverse transcriptase (pol), and envelope protein (env). (After McCarthy and McDonald 2003.)

Retrotransposable elements are interesting for evolutionary as well as
functional reasons. They are similar in structure and mode of replication to infectious
retroviruses (Coffin et al. 1997), and their proliferative nature makes them
quintessential "selfish" or "parasitic" elements within cells. Although quite variable
in relative abundance, they and other classes of mobile elements often
constitute huge fractions of plant and animal genomes (Brosius 1999), making up
50%-80% of the corn genome and 90% of the wheat genome (Flavell 1986;
SanMiguel et al. 1996), for example, and about 40% or more of the genomes of
many mammals (Smit 1999), including humans (Yoder et al. 1997). Mobile
elements tend to induce mutations in host genomes when they jump from spot to
spot, and this factor, together with the suspected metabolic burden of their
maintenance, produces conflicts of interest with their host cells. In this
coevolutionary war, host genomes occasionally win battles too, as evidenced by the
fact that some former jumping genes appear to have been recruited to various
cellular tasks that benefit their host (McDonald 1990, 1998).

Two kinds of algorithms can be employed in computer-based searches of
available genomic sequence for the presence of particular families of TEs (or
any other specified gene sequences). In the traditional method (often
implemented in the BLAST program; Altschul et al. 1997), a researcher compares a
specific nucleotide sequence of interest (the “query”) with one or more sequences
in the database, looking for significant matches, often arbitrarily defined as
90% or more sequence similarity. The second method involves scanning the
database for defined structural features of particular sequences of interest. In
the case of RTEs, these structural signatures can be two long stretches of
nucleotide sequence (putative LTRs) that are (1) highly similar to each other,
(2) in fairly close proximity in the genome, and (3) themselves flanked by short
target repeat sites (McCarthy and McDonald 2003).
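
The two search strategies described at the end of Box 1.3 can be contrasted in a short Python sketch. Both functions are deliberately naive stand-ins, and neither reproduces the algorithms of the programs cited in the box. The function names, the ungapped sliding-window comparison, the toy "genome," and the 90% threshold are assumptions made for illustration. The structural scan checks only criteria (1) and (2), similarity and proximity of the two putative LTRs, and ignores the flanking target repeats of criterion (3).

```python
def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def similarity_hits(query: str, subject: str, threshold: float = 0.9):
    """Query-based search: ungapped windows of subject matching query at >= threshold."""
    k = len(query)
    hits = []
    for start in range(len(subject) - k + 1):
        score = identity(query, subject[start:start + k])
        if score >= threshold:
            hits.append((start, score))
    return hits

def candidate_ltr_pairs(genome: str, ltr_len: int, max_gap: int, threshold: float = 0.9):
    """Structure-based search: pairs of similar windows at most max_gap bases apart."""
    n = len(genome)
    pairs = []
    for i in range(n - 2 * ltr_len + 1):
        left = genome[i:i + ltr_len]
        last_j = min(i + ltr_len + max_gap, n - ltr_len)
        for j in range(i + ltr_len, last_j + 1):
            if identity(left, genome[j:j + ltr_len]) >= threshold:
                pairs.append((i, j))
    return pairs

# Toy sequence: two copies of a hypothetical 10-base "LTR" flanking an interior region.
ltr = "ACGTACGTAC"
genome = "TTTT" + ltr + "CCCCCCCC" + ltr + "GGGG"

print(similarity_hits(ltr, genome))                      # needs a known query
print(candidate_ltr_pairs(genome, ltr_len=10, max_gap=10))  # needs no query
```

The design difference is the point of the example: the first search can find only elements resembling a sequence already in hand, whereas the second can in principle flag previously unknown families that share the same architecture.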





Why Not Employ Molecular Genetic Markers? 


Against these advantages of molecular genetic methods appear to stand 
only two major disadvantages: Considerable training is required of
practitioners, and monetary costs are rather high (but also variable across
methods) by traditional systematics standards (Weatherhead and Montgomerie
1991). However, a fact sometimes overlooked is that most molecular
genealogical assessments have proved to support (rather than contradict)
earlier phylogenetic hypotheses based on morphology or other phenotypic
characteristics. Thus, a complete molecular reanalysis of the biological
world is unnecessary for phylogenetic purposes. In such genealogical
applications, molecular markers are used most intelligently when they address
controversial areas or when they are employed to analyze problems in
natural history and evolution that fall beyond the purview or capabilities of
traditional nonmolecular observation.

TABLE 1.1 Twelve unorthodox perspectives on evolution prompted by molecular genetic
findings on animal mitochondrial DNA


1. Asexual Transmission (Chapter 3) 
Cytoplasmic genes within sexually reproducing species normally exhibit clonal (uniparental, 
non-recombinational) transmission. 


2. A New Level in the Population Hierarchy 
Entire populations of mtDNA molecules inhabit somatic and germ cell lineages within each 
individual (Birky et al. 1989). 

3. Non-Universal Code (Chapter 8) 
Genetic codes in mtDNA sometimes differ among taxa, and also differ from the nuclear code
formerly thought to be universal. 


4. Conserved Function, Rapid Evolution (Chapter 3) 
Considerations in addition to functional constraint are required to explain the rapid pace of 
animal mtDNA evolution. 

5. Lack of Mobile Elements, Introns, Repetitive DNA 
Genes with selfish motives gain no fitness advantage by becoming repetitive within an asexu- 
ally transmitted genome (Hickey 1982). 

6. Endosymbiotic Origins (Chapter 8) 
Eukaryotic organisms are genetic mosaics containing interacting nuclear and organelle 
genomes that are descended from what had been independent forms of life early in Earth's 
history. 

7. Intergenomic Conflicts of Interest 
Because of their contrasting modes of biparental versus uniparental inheritance, nuclear and 
cytoplasmic genomes have inherent evolutionary conflicts of interest (in addition to their evi- 
dent requirements for functional collaboration) (Avise 2003a; Eberhard 1980). 

8. Intergenomic Cooperation 
Multitudinous interactions between products of cytoplasmic and nuclear genes lead to expecta- 
tions of functional coevolution between the different genomes within a cell (Kroon and Saccone 
1980). 

9. Matrilineal Genealogy (Chapter 6) 
Mutational differences among mtDNA haplotypes record the phylogenetic histories of female 
lineages within and among species. 

10. Gene Trees versus Organismal Phylogenies (Chapters 4, 6, and 7) 

In sexually reproducing organisms, pedigrees contain multitudinous gene genealogies (gene 
trees) that differ in topological details from locus to unlinked locus, and may also differ from a 
composite population-level or species-level phylogeny. Thus, a species tree or cladogram is in 
actuality a statistical "cloudogram" (Maddison 1997), with a variance, of semi-independent 
gene trees. 


11. The Historical, Nonequilibrium Nature of Microevolution (Chapter 6)
Genealogical signals from various molecular markers indicate that historical idiosyncrasies and 
nonequilibrium population genetic outcomes are a sine qua non of intraspecific (as well as 
interspecific) evolution. 

12. Degenerative Diseases 
Genetic defects in mitochondrial oxidative phosphorylation provide a new paradigm for the 
study of aging and degenerative diseases (Wallace 1992; Wallace et al. 1999). 


Source: After Avise 1991a. 














The History of Interest in 
Genetic Variation 


In 1951, the problematic of population genetics was the description and 
explanation of genetic variation within and between populations. That 
remains its problematic 40 years later ... 

R. C. Lewontin, 1991 


Since their inception in the latter half of the twentieth century, the fields of molec- 
ular ecology and molecular evolution have been preoccupied with the functional 
role and the possible adaptive significance of genetic variation. This focus led to 
compelling conceptual and empirical debates that captured nearly everyone's 
interest, but also served to divert attention from what many researchers perceived 
as mundane applications of molecules as "mere" genetic markers. Thus, with rel- 
atively few notable exceptions before the mid-1980s (e.g., Selander 1982), most of 
the early research programs that employed molecular assays were preoccupied 
with uncovering functional variation and illuminating how natural selection 
operates at the levels of proteins and DNA. Only gradually did molecular mark- 
ers come to be appreciated on their own merit (even if many of them might be 
selectively neutral) for the empirical and conceptual richness they can bring to 
studies of organismal behavior, natural history, and phylogenetic relationships. 
This chapter traces the history of scientific interest in natural selection's role in 
maintaining molecular variation. It also describes why an understanding of that 
role, while extremely important, is seldom a precondition for employing mole- 
cules as genealogical markers. 


The Classical-Balance Debate 


Classical versus balance views of genome structure 


Prior to the molecular era that began in the mid-1960s, the magnitude of 
genetic variability in animal and plant genomes was the subject of a long- 
standing controversy. Evolution has been defined as temporal changes in the 
genetic composition of populations (Dobzhansky 1937). Genetic variation is 
prerequisite for this process. Thus, a central empirical challenge for popula- 
tion genetics has always been to measure genetic variability under the ration- 
ale that such quantification would help to reveal the operation of natural 
selection as well as mutation, genetic drift, and other evolutionary forces 
(Gillespie 1987; Kimura 1991; Kreitman 1987; Li 1978; Ohta and Tachida 1990). 
Unfortunately, the exact genes or alleles responsible for the phenotypic varia- 
tion routinely observable within and among natural populations rarely could 
be specified explicitly. This problem of empirical insufficiency plagued popu- 
lation genetics throughout the first half of the twentieth century, as evidenced 
by the establishment of two diametrically opposed scientific opinions about 
magnitudes of genetic variation in nature. Advocates of the "classical" school 
maintained that genetic variability in most species was low, such that conspe- 
cific individuals typically were homozygous for the same “wild-type” allele 
at nearly all genes. Proponents of the "balance" view maintained that genetic 
variation was high—that most loci were polymorphic, and most individuals 
were heterozygous at a large fraction of genetic loci. 

Several corollaries and ramifications stem from these opposing schools 
of thought (Lewontin 1974). Under the classical view, natural selection was 
seen as a purifying agent, cleansing the genome of inevitable mutational 
variation. Deleterious recessive alleles in heterozygotes might escape elimi- 
nation temporarily, but were prevented from reaching high frequencies in 
populations because of their negative fitness consequences when homozy- 
gous. The classicists did not deny adaptive evolution, but they felt that the 
process was due to occasional selectively advantageous mutations that 
would quickly sweep through a species to become the new wild-type alle- 
les. Because little genetic variation was available to be shuffled into novel 
multi-locus allelic associations, recombination was viewed as a rather 
insignificant process compared with mutation. Furthermore, any genetic 
differences that might be uncovered between populations or species must 
be of profound importance (because of the low within-population compo- 
nent of variability). Central to the classical school was the concept of genet- 
ic load (Wallace 1970, 1991): the notion that genetic variation produces a 
heavy burden of diminished fitness, which in the extreme might even cause 
population extinction. This perception of genetic variation as a curse was 
forcefully summarized by Muller (1950), who predicted from genetic load 
calculations that only one locus in 1,000 (0.1%) would prove to be heterozy- 
gous in a typical individual. 

The balance school, by contrast, viewed natural selection as favoring 
genetic polymorphisms through balancing mechanisms such as the fitness 
superiority of heterozygotes (Dobzhansky 1955), variation in genotypic 
fitness among habitats, or frequency-dependent fitness advantages (Ayala 
and Campbell 1974). Genetic variability was thought to be both ubiquitous 
and adaptively relevant. Deleterious alleles were not ruled out, but pre- 
sumably were held in check by natural selection and contributed little to 
heterozygosity. Because high variability was predicted for sexually repro- 
ducing species, no allele could properly be termed wild-type. Genetic 
recombination, therefore, assumed far greater significance than de novo 
mutation in producing inter-individual fitness variation from one genera- 
tion to the next. Furthermore, genetic differences among populations were 
perhaps of less importance (because of the high within-population com- 
ponent of overall variability). How much genetic variation was predicted 
under the balance view? Wallace (1958) raised a proposal that seemed 
extreme at the time, but not at all unreasonable today: "The proportion of 
heterozygosis among gene loci of representative individuals of a popula- 
tion tends towards 100 percent." 

The balance hypothesis gained support from several indirect lines of 
evidence: extensive phenotypic variation, which in wild populations of 
several well-studied species often proved to be genetically influenced and 
adaptively relevant (e.g., Ford 1964); a genetic underpinning for many nat- 
urally occurring morphological variants and fitness characters in popula- 
tions that could be experimentally manipulated (e.g., by inbreeding, or 
through "common garden" experiments in which the fraction of phenotyp- 
ic variation attributable to genetic influence could be estimated by control- 
ling for environmental effects); and fast genetic responses to artificial selec- 
tion exhibited by numerous traits of many domestic animals and plants 
(reviewed in Ayala 1982a). However, none of these or related observations 
permitted direct answers to the fundamental question: What fraction of 
genes is heterozygous in an individual and polymorphic in a population? 

An answer to this question requires that variation be assessed at many 
independent loci, chosen without bias with respect to magnitude of 
genomic variability. But this requirement introduces a catch-22 for any 
appraisal based on conventional Mendelian genetic approaches: Genes 
underlying a particular phenotype can be identified only when they carry 
segregating polymorphisms. In other words, genetic assignments for phenotypic 
features traditionally were inferred from segregation patterns of 
allelic variants through organismal pedigrees, but this also meant that 
invariant loci escaped detection, and no accumulation of such data could 
provide an uncolored estimate of overall genetic variation. Other means 
were needed to screen genetic variability more directly, and in a manner 
that allowed assay of an unbiased sample of polymorphic and monomorphic 
loci. 


Molecular input to the debate 


A fundamental breakthrough occurred in 1966, when independent research 
laboratories published the first estimates of genetic variability based on 
multi-locus protein electrophoresis (Harris 1966; Johnson et al. 1966; 
Lewontin and Hubby 1966). This method involves separation of non-denatured 
proteins by their net charge under the influence of an electric current, 
followed by application of histochemical stains to reveal enzymatic or other 
protein products of particular, specifiable genes (see Chapter 3). Because 
invariant as well as variant proteins are revealed, this approach represented 
the first serious attempt to obtain unbiased estimates of genomic variability 
at a reasonable number (usually 20-50) of genetic loci. The empirical results 
were clear: Genomes of fruit flies and humans harbored a wealth of varia- 
tion, with 30% or more of assayed genes polymorphic in a population, and 
roughly 10% of loci heterozygous in a typical individual (Box 2.1). Especially 
over the next two decades, multi-locus electrophoretic surveys were con- 
ducted on hundreds of plant and animal species, and they likewise revealed 
levels of genetic variation that were often high, but also quite variable among 
species (Figure 2.1; see seminal reviews by Hamrick and Godt 1989; Nevo 
1978; Powell 1975; Ward et al. 1992). 

Protein electrophoretic techniques were not entirely new in 1966; indeed, 
crude methods had been available for nearly 30 years (see Brewer 1970). 
Rather, the scientific impact of the landmark allozyme surveys lay 
primarily in the manner in which methods and data from the seemingly 
alien field of molecular biology were applied for the first time to 
long-standing issues in population genetics. After the mid-1960s, contacts 
between molecular genetics and population genetics would only expand, and at 
an ever-faster pace. Today, these disciplines are thoroughly wedded. 

The data and conceptual orientations of the protein electrophoretic era 
exerted an overriding influence on research goals in population genetics for 
the next quarter century. Although the allozyme revolution provided original 
information on genetic variation in natural populations, it also inhibited 
a stronger molecular genealogical focus in at least two ways. First, at each 
locus, protein electrophoresis reveals qualitative allelic products 
(electromorphs) whose phylogenetic order cannot be inferred safely from the 
observable property (band mobility on a gel). Much of traditional population 
genetic theory, built around the formative works of Fisher (1930), Wright 
(1931), and Haldane (1932), can be couched in the language of the expected 
frequency dynamics of such phylogenetically unordered character states 
under the separate or joint evolutionary forces of natural selection, genetic 
drift, gene flow, mutation, and recombination. Thus, protein electrophoresis 
produced data that could be interpreted using the language and perspectives 
of traditional population genetic theory, but at the price of diverting attention 
from the phylogenetic orientation that increasingly characterized other areas 
of evolutionary biology. Thus, 20 years after the allozyme revolution began, 
Lewontin (1985) concluded, "Population genetics is conceptually the study 
of gene lineages, [but] until now, the data to study such lineages have not 
really been available." 

Second, by necessarily focusing on issues of genetic variation per se 
(rather than on the genealogical content of the molecular information), the 
protein electrophoretic era stimulated new quests to refine and interpret 
data-based estimates of molecular variability. The field of empirical 
population genetics soon became further preoccupied with characterizing and 
quantifying genetic variation, first via more refined comparisons of proteins 
and later directly at the level of DNA. 


Figure 2.1 Allozyme-based estimates of mean multi-locus heterozygosity. 
Shown are estimates of mean heterozygosity (H), per species, based on empirical 
surveys of nearly 2,000 species of vertebrate and invertebrate animals and plants. 
An average of more than 20 loci were scored per study. (Data for animals from 
Ward et al. 1992; data for plants from Hamrick and Godt 1989.) 


BOX 2.1 Measures of Genetic Variability within a Population 

For multi-locus protein electrophoretic data (or other comparable classes of 
information, such as data from microsatellite loci), one useful measure of 
genetic diversity is population heterozygosity (H), defined as the mean 
percentage of loci heterozygous per individual (or equivalently, the mean 
percentage of individuals heterozygous per locus). Estimates of H can be 
obtained by direct count from a raw data matrix, the body of which consists of 
observed diploid genotypes, as in the following hypothetical example involving 
eight loci (A-H) scored in each of five individuals (i): 

[Data matrix of diploid genotypes, five individuals by eight loci.] 

Here, diploid genotypes are indicated by italic lowercase letters (each letter 
representing an allele), and heterozygotes are boldfaced. In this example, 11 of 
40 genotypes are heterozygous (H = 0.275). Equivalently, H may be 
interpreted as the mean of the row or column totals, which provide direct-count 
heterozygosities for single individuals ($h_i$) or single loci ($h_l$), respectively. 

Heterozygosities also may be estimated from observed frequencies of 
alleles (rather than genotypes), assuming that the population is in 
Hardy-Weinberg equilibrium. Thus, $h_l = 1 - \sum_k a_k^2$, where $a_k$ is the 
frequency of the kth allele. Other common measures of population variability 
for such data are the mean number of alleles per locus and the proportion of 
polymorphic loci (P), which is 0.6 in the above example. To avoid an expected 
positive correlation between P and sample size, a locus usually is considered 
polymorphic only if the frequency of the most common allele falls below an 
arbitrary cutoff, typically 0.99 or 0.95. 

For molecular data that involve restriction sites or nucleotide sequences 
along a particular stretch of DNA that can be thought of as one locus, a useful 
statistic summarizing heterozygosity at the base-pair level is nucleotide 
diversity (Nei and Li 1979; Nei and Tajima 1981), or the mean sequence 
divergence between haplotypes (alleles): $p = \sum f_i f_j p_{ij}$, where $f_i$ and 
$f_j$ are the frequencies of the ith and jth haplotypes in a population, and 
$p_{ij}$ is the sequence divergence between them. Another informative measure is 
haplotype diversity: $h = 1 - \sum f_i^2$. This measure is a DNA-level analogue 
of h (defined above for protein electrophoretic data) because its calculation 
entails no assessment of the magnitude of genetic divergence between the 
alleles involved. Depending on the loci and species surveyed, nucleotide 
diversities within a population often fall in the range of 0.001-0.020 (Stephan 
and Langley 1992), and haplotype diversities may be above 0.5 for rapidly 
evolving markers such as microsatellites or animal mtDNA. 
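
The variability measures defined in Box 2.1 can be made concrete with a short Python sketch. The function names, the three-individual data matrix, and the haplotype frequencies and divergences below are all invented for illustration (they are not the example from the box), and the nucleotide diversity sum is assumed to run over all ordered pairs of distinct haplotypes. 

```python
from collections import Counter


def direct_count_heterozygosity(genotypes):
    """Fraction of all scored diploid genotypes that are heterozygous (H)."""
    scored = [pair for individual in genotypes for pair in individual]
    return sum(1 for a, b in scored if a != b) / len(scored)


def allele_frequencies(genotypes, locus):
    """Allele frequencies at one locus (a column index of the data matrix)."""
    counts = Counter(
        allele for individual in genotypes for allele in individual[locus]
    )
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def diversity(freqs):
    """One minus the sum of squared frequencies.

    With allele frequencies this is the Hardy-Weinberg expected
    heterozygosity of a locus; with haplotype frequencies it is
    haplotype diversity.
    """
    return 1.0 - sum(f * f for f in freqs.values())


def proportion_polymorphic(genotypes, cutoff=0.95):
    """Proportion of loci whose most common allele is rarer than `cutoff` (P)."""
    n_loci = len(genotypes[0])
    polymorphic = sum(
        1
        for locus in range(n_loci)
        if max(allele_frequencies(genotypes, locus).values()) < cutoff
    )
    return polymorphic / n_loci


def nucleotide_diversity(hap_freqs, divergence):
    """Sum of f_i * f_j * p_ij over ordered pairs of distinct haplotypes."""
    total = 0.0
    for i, f_i in hap_freqs.items():
        for j, f_j in hap_freqs.items():
            if i != j:
                total += f_i * f_j * divergence[frozenset((i, j))]
    return total


# Invented data: 3 individuals (rows) scored at 4 loci (columns).
GENOTYPES = [
    [("a", "a"), ("a", "b"), ("a", "a"), ("b", "c")],
    [("a", "a"), ("a", "a"), ("a", "a"), ("b", "b")],
    [("a", "a"), ("b", "b"), ("a", "a"), ("c", "c")],
]

# Invented haplotype frequencies and pairwise sequence divergences.
HAP_FREQS = {"h1": 0.5, "h2": 0.3, "h3": 0.2}
DIVERGENCE = {
    frozenset(("h1", "h2")): 0.01,
    frozenset(("h1", "h3")): 0.02,
    frozenset(("h2", "h3")): 0.03,
}

if __name__ == "__main__":
    print("H (direct count):", direct_count_heterozygosity(GENOTYPES))
    print("P (0.95 cutoff): ", proportion_polymorphic(GENOTYPES))
    print("h at locus 1:    ", diversity(allele_frequencies(GENOTYPES, 1)))
    print("haplotype div.:  ", diversity(HAP_FREQS))
    print("nucleotide div.: ", nucleotide_diversity(HAP_FREQS, DIVERGENCE))
```

Note that one function serves for both single-locus expected heterozygosity and haplotype diversity, which mirrors the point in the box that haplotype diversity ignores how divergent the alleles are. Only nucleotide diversity uses the pairwise divergences. 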


Questions of empirical refinement 


How much cryptic variation at the protein level remained hidden beyond 
the resolving power of conventional gel electrophoresis? To address this 
question, two general experimental protocols were followed. In "backward 
experiments" (Selander and Whittam 1983), protein variants of known 
amino acid sequence were subjected to electrophoresis to determine the pro- 
portion of known alleles that were detectable (Ramshaw et al. 1979). 
Unfortunately, this approach could be applied only in the few instances in 
which proteins already had been well characterized by other methods. In 
"forward experiments," assay conditions (buffer concentrations, thermal 
regimes, gel-sieving media, etc.) were varied so as to further discriminate 
alleles within the electromorph classes identified in the original tests. This 
approach often uncovered additional protein variants (particularly at loci 
that were polymorphic in the initial assays), and it left a general impression 
that the original electrophoretic methods had revealed only the tip of the 
genetic variation iceberg (Aquadro and Avise 1982a, 1982b; Ayala 1982b; 
Bernstein et al. 1973; Bonhomme and Selander 1978; Coyne 1982; Johnson 
1976a, 1977; McDowell and Prakash 1976; Milkman 1976; Prakash 1977). 
Unfortunately, data from these forward experiments were often difficult to 
interpret because the particular genetic bases of the polymorphisms seldom 
were clear from the assays. Thus, none of these refined methods gained 
more than transient popularity, nor did they offer good sources of utilitari- 
an genetic markers. 

How representative of the genome were the variability estimates derived 
from protein electrophoretic loci? Because the availability of histochemical 
stains was a deciding criterion for the inclusion of particular proteins in most 
electrophoretic surveys, dehydrogenases and other enzymes of the glycolytic 
pathway and citric acid cycle were disproportionately represented in the 
assays. An initial concern was whether these proteins might provide misrep- 
resentative estimates of genetic variation at other protein-coding genes. 
Thus, for a brief time in the late 1970s and early 1980s, attention shifted to 
abundant proteins (membrane-associated, ribosomal, and others) as revealed 
by nonspecific stains and newly introduced two-dimensional electrophoresis 
(which separates proteins by isoelectric focusing in one dimension, and then 
by molecular weight in the second; O'Farrell 1975). For several species, some- 
what lower heterozygosities were revealed than had been estimated in the 
original allozyme surveys (Aquadro and Avise 1981; Leigh Brown and 
Langley 1979; Racine and Langley 1980; Smith et al. 1980). 

However, any lingering thoughts that genomes might harbor little 
genetic variation were conclusively dispelled by the subsequent explosion 
of direct DNA-level information. Indeed, researchers soon discovered that 
at most protein-coding genes, synonymous or "silent-site" nucleotide poly- 
morphisms (those not translated into amino acid variations) greatly out- 
number the non-synonymous or replacement substitutions that in effect had 
been the subject of earlier protein electrophoretic surveys. Furthermore, in 
most species, nucleotide polymorphism was found to be a nearly ubiquitous 
feature of a wide variety of DNA sequences, both protein-coding and otherwise 
(Li 1997; Nei and Kumar 2000). 





The Neutralist-Selectionist Debate 


Strangely, the discoveries of extensive molecular variation did not clinch the 
case for the philosophical perspective on genetic variation embodied in the 
balance school of thought. Instead, they prompted development of an alter- 
native explanation for molecular variability that was to assume a prominent 
role in population genetics to the present time. Under this strict "neutral 
mutation theory," alternative alleles confer no differential fitness effects on 
their bearers. The theory, as summarized by Kimura (1991), holds that "the 
great majority of evolutionary mutant substitutions at the molecular level 
are caused by random fixation, through sampling drift, of selectively neu- 
tral (i.e., selectively equivalent) mutants under continued mutation pres- 
sure." As applied to intraspecific variability, neutrality theory predicted that 
most molecular polymorphisms are maintained by a balance between muta- 
tional input and random allelic extinction by genetic drift. Neutralists did 
not deny the existence of high molecular genetic variation, but rather ques- 
tioned its relevance to organismal fitness. Because neutralists and classicists 
shared the views that balancing selection plays little role in maintaining 
molecular polymorphism, and that most selection is directional or "purify- 
ing" selection against deleterious alleles, neutrality theory also was referred 
to as neoclassical theory (Lewontin 1974). 

Several points should be made clear at the outset. First, neutrality theory 
did not suggest that most genes or allelic products are dispensable (of course 
they are not). Rather, it proposed that different alleles at a locus are functionally 
equivalent, such that organismal fitness does not vary as a function of the 
particular genotypes possessed. Second, neutrality theory did not deny that 
many de novo mutations are deleterious and are eliminated by purifying 
selection. Rather, the focus was on the supposed neutrality of segregating 
polymorphisms that escape selective elimination. Indeed, one cornerstone of 
neutrality theory was that functionally unconstrained nucleotide positions or 
genic regions are those most likely to harbor neutral variation and to exhibit 
the most rapid pace of allelic substitution. Third, neutrality theory did not 
fundamentally challenge the Darwinian mode of adaptive evolution for 
organismal morphologies and behaviors (although some extensions of neu- 
trality theory did propose a significant role for genetic drift at these pheno- 
typic levels also; Kimura 1990). Rather, neutrality theory developed in 
response to the intellectual challenge posed by the unexpectedly high levels 
of observed molecular variability. 

Neutrality concepts were introduced in the late 1960s (Kimura 1968a,b) 
and gained immediate widespread attention, due in part to a paper by King 
and Jukes (1969) provocatively entitled “Non-Darwinian evolution: 
Random fixation of selectively neutral mutations.” Indeed, the theory 
directly challenged the prevailing approach of naively extending to the 
molecular level neo-Darwinian views on the adaptive significance of nearly 
all differences in organismal phenotype (see Gould and Lewontin 1979). It 
is quite remarkable that within a decade, Kimura’s neutrality theory (and its 
intellectual offshoot, the “nearly-neutral theory”; Ohta 1992a) gained almost 
universal acceptance as molecular evolution's gigantic “null hypothesis"— 
the simplest possible conceptual framework for interpreting molecular variability, 
and the basic theoretical construct whose predictions must be falsified 
before alternative proposals invoking balancing or other forms of selection 
were to be seriously entertained. This is not to say, however, that the 
neutralist-selectionist debate is fully resolved. 

The neutrality school has strong roots in the quantitative tradition of 
theoretical population genetics developed earlier in the twentieth century 
(Fisher 1930; Haldane 1932; Wright 1931). An elegant and elaborate theory 
predicts the amount of genetic variability within a given population as a 
function of mutation rate, gene flow (where applicable), and population size 
(Kimura and Ohta 1971). Conspicuously absent from the calculations are 
selection coefficients, because alleles are assumed to be neutral. Under strict 
neutrality theory, molecular variability is a function of neutral mutation rate 
and evolutionary effective population size, $N_e$ (Box 2.2). For example, the 
magnitude of heterozygosity expected for electrophoretically detectable 
alleles at equilibrium between mutation and genetic drift is given by 

$$H = 1 - \frac{1}{\sqrt{1 + 8 N_e \mu}} \qquad (2.1)$$

where $\mu$ is the per locus per generation mutation rate to neutral alleles (Ohta 
and Kimura 1973). Figure 2.2 plots this expected relationship between H and 
$N_e$ for reasonable neutral mutation rates, and also shows the range of 
allozyme heterozygosities empirically observed for numerous animal species 
with indicated population census sizes. (Such comparisons should deal with 
species-level effective population sizes because the theory involves equilibrium 
expectations over longer-term evolution.) 
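
To see how quickly the neutral expectation rises with population size, the relationship can be tabulated in a few lines of Python. This is only a sketch: it assumes the equilibrium expectation takes the form $H = 1 - 1/\sqrt{1 + 8 N_e \mu}$ usually attributed to Ohta and Kimura (1973) for electrophoretically detectable alleles, and the two mutation rates and the range of $N_e$ values are illustrative choices rather than the values plotted in Figure 2.2. 

```python
import math


def expected_heterozygosity(n_e, mu):
    """Neutral equilibrium heterozygosity, H = 1 - 1 / sqrt(1 + 8 * N_e * mu).

    n_e: evolutionary effective population size
    mu:  per-locus, per-generation mutation rate to neutral alleles
    """
    return 1.0 - 1.0 / math.sqrt(1.0 + 8.0 * n_e * mu)


if __name__ == "__main__":
    # Illustrative (assumed) mutation rates and effective population sizes.
    for mu in (1e-7, 1e-6):
        for n_e in (1e3, 1e4, 1e5, 1e6, 1e7):
            h = expected_heterozygosity(n_e, mu)
            print(f"mu = {mu:g}   N_e = {n_e:g}   expected H = {h:.3f}")
```

Because H depends only on the product $N_e \mu$, a tenfold change in mutation rate shifts the curve by one unit along a logarithmic population-size axis without changing its shape. 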


BOX 2.2 Effective Population Size 

Not all individuals in a population contribute gametes to the next generation 
with equal probability. This realization has led to the concept of effective 
population size ($N_e$), originally proposed by Wright (1931). The effective 
population size refers to the number of individuals in an idealized population 
that would have the same genetic properties (such as inter-generational 
variance in allele frequencies due to chance sampling error) observed in the 
real population. Usually, $N_e$ is much smaller than N (the census size) for one 
or more of the following reasons: 

1. Separate sexes. In organisms with separate sexes, one sex may be more 
common than the other. Let $N_m$ and $N_f$ be census numbers of males and 
females in such a population. Then effective population size due to this factor 
alone is 

$$N_e = \frac{4 N_m N_f}{N_m + N_f}$$

This equation shows that, unless $N_m = N_f$, $N_e$ is less than the total census 
count ($N_m + N_f$). 

2. Fluctuations in population size. Most populations in nature probably 
fluctuate greatly in size due to disease, changes in habitat quality, predation, 
and so forth. The effective population size due to such fluctuations is equal to 
the harmonic mean of breeding population sizes across generations. A harmonic 
mean is a function of the mean of reciprocals, or in this case, 

$$N_e = \frac{n}{\sum_{i=1}^{n} (1 / N_i)}$$

where $N_i$ is the population size in the ith generation and n is the number of 
generations. A harmonic mean is closer to the smaller rather than to the larger 
of a series of numbers being averaged, so $N_e$ can be much lower than most 
population censuses. A severe reduction in population size, called a "population 
bottleneck," can greatly depress evolutionary $N_e$. 

3. Combination of separate sexes and fluctuating population size. If census 
population sizes of males and females are known across multiple generations, 
joint effects of the above factors on $N_e$ can be determined. For each 
generation, census sizes of males and females are converted to an effective size 
for that generation, as in (1). Then, the equation in (2) is employed to take the 
harmonic mean of the single-generation estimates. 

4. Variation in progeny numbers. Even in a stable-sized population with equal 
numbers of males and females, some individuals may leave many more 
progeny than others, creating a large fitness variance across families. Only 
when offspring numbers follow a Poisson distribution, with a mean (and 
hence variance) of 2.0 per family, does $N_e = N$. In more realistic situations, in 
which the variance often exceeds the mean, $N_e$ is smaller than the census 
breeding population size (Crow 1954). Organisms with extremely high 
fecundity are particularly prone to gross disparities between $N_e$ and N due to 
high variability in fertility across individuals (Hedgecock et al. 1992; 
Hedgecock and Sly 1990). 

5. Other factors. Various other factors can also reduce $N_e$ relative to N. For 
example, in a species composed of many subpopulations, each of which is subject 
to periodic extinction and recolonization, the species as a whole can have a far 
lower $N_e$ than might have been predicted had only composite census sizes for 
recent generations been considered (Maruyama and Kimura 1980). 
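
Items (1) through (3) of Box 2.2 translate directly into arithmetic, sketched here in Python. The function names and the census figures are hypothetical, and the formulas are assumed to be the standard ones for unequal sex ratio ($4 N_m N_f / (N_m + N_f)$) and for fluctuating size (the harmonic mean across generations). 

```python
def ne_sex_ratio(n_males, n_females):
    """Effective size due to unequal numbers of the two sexes, as in item (1)."""
    return 4.0 * n_males * n_females / (n_males + n_females)


def ne_harmonic_mean(sizes):
    """Effective size across generations of fluctuating size, as in item (2).

    `sizes` holds one breeding population size per generation.
    """
    return len(sizes) / sum(1.0 / n for n in sizes)


def ne_combined(censuses):
    """Joint effect of both factors, as in item (3).

    `censuses` is a sequence of (n_males, n_females) pairs, one per generation.
    Each pair is first converted to a single-generation effective size, and the
    harmonic mean of those values is then taken.
    """
    return ne_harmonic_mean([ne_sex_ratio(m, f) for m, f in censuses])


if __name__ == "__main__":
    # Hypothetical census data, for illustration only.
    print("10 males, 90 females:", ne_sex_ratio(10, 90))
    print("sizes 1000, 1000, 10:", round(ne_harmonic_mean([1000, 1000, 10]), 1))
    print("combined:", round(ne_combined([(500, 500), (10, 90), (500, 500)]), 1))
```

In the second call a single generation of 10 breeders pulls the long-term effective size of a population that otherwise numbers 1,000 down to roughly 29, which illustrates why a bottleneck dominates the harmonic mean. 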


Figure 2.2 Predicted relationship between species effective population size and 
mean protein electrophoretic heterozygosity under neutrality theory. 
Expectations for two plausible neutral mutation rates ($\mu$) are presented. Also 
shown (shaded area) is the general range of observed allozyme heterozygosities for 
numerous animal species as a function of present-day census population size (N, 
logarithmic scale). (After Soule 1976.) 


BOX 2.3 Mean Times to Shared Allelic Ancestry 

Another way to formulate neutrality theory regarding the association between 
genetic variability and population size is through consideration of the expected 
frequency distribution of times to common ancestry among alleles. Imagine an 
idealized population with non-overlapping generations and large constant size 
N. Suppose further that in each generation, individuals contribute to a gamete 
pool from which 2N nuclear gametes are drawn at random (effectively with 
replacement) to produce individuals of the next generation. The probability that 
two gametes carry copies of the same allele from the prior generation is 1/2N. 
This is also the probability that the time to common ancestry of two alleles is 
one generation ago (G = 1). The probability that a pair of alleles is not identical 
from the prior generation is 1 - 1/2N. Thus, the probability that these latter 
alleles trace to an identical copy two generations ago is (1 - 1/2N)(1/2N). From an 
extension of such reasoning, the probability that two randomly chosen alleles 
derive from a common ancestral allele that existed G generations ago is 

$$f(G) = \frac{1}{2N} \left(1 - \frac{1}{2N}\right)^{G-1}$$

or approximately 

$$f(G) \approx \frac{1}{2N} \, e^{-(G-1)/2N}$$

This equation gives the probability distribution of times to common ancestry 
in terms of the number of generations (Tajima 1983). The distribution is 
geometric, with mean approximately 2N. The mean time to shared haplotype 
ancestry for mtDNA genes can be derived similarly (Avise et al. 1988), but in 
theory is only one-fourth as large as for autosomal nuclear genes, the 
difference being attributable to a twofold effect due to the haploid transmission 
of mtDNA and another twofold effect due to mtDNA's normal pattern of 
uniparental transmission. 

The above theory assumes that times to common ancestry for allelic pairs 
are independent. Therefore, in interpreting empirical data for any particular 
species against these expectations (see Chapter 6), caution must be exercised 
because the history of lineage coalescence within a real population imposes a 
severe correlation on pairwise comparisons (Ball et al. 1990; Felsenstein 1992; 
Hudson 1990; Slatkin and Hudson 1991). 
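
The reasoning in Box 2.3 can be checked numerically. The Python sketch below assumes the exact distribution is geometric with per-generation coalescence probability 1/2N, and that the approximation is the corresponding exponential form. The function names and the population sizes used are arbitrary illustrations. 

```python
import math


def coalescence_prob(g, n):
    """Exact probability that two alleles last shared an ancestor g generations ago.

    Geometric distribution: (1/2N) * (1 - 1/2N) ** (g - 1).
    """
    p = 1.0 / (2.0 * n)
    return p * (1.0 - p) ** (g - 1)


def coalescence_prob_approx(g, n):
    """Exponential approximation: (1/2N) * exp(-(g - 1) / 2N)."""
    return math.exp(-(g - 1) / (2.0 * n)) / (2.0 * n)


def mean_coalescence_time(n, max_generations):
    """Mean of the exact distribution, summed over 1..max_generations.

    The sum is truncated, so max_generations should be many multiples of 2N.
    """
    return sum(g * coalescence_prob(g, n) for g in range(1, max_generations + 1))


def mean_time_mtdna(n):
    """One-fourth of the nuclear expectation of 2N, as stated for mtDNA."""
    return 2.0 * n / 4.0


if __name__ == "__main__":
    n = 50  # arbitrary illustrative population size
    print("nuclear mean (generations):", round(mean_coalescence_time(n, 5000), 3))
    print("mtDNA mean (generations):  ", mean_time_mtdna(n))
    print("exact vs approx at G = 500, N = 1000:",
          coalescence_prob(500, 1000), coalescence_prob_approx(500, 1000))
```

The truncated sum converges on 2N, and for large N the exponential form tracks the geometric one closely. Neither calculation addresses the non-independence of pairwise comparisons noted at the end of the box. 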


Except perhaps for some of the least abundant species, observed values 
of H have proved to be much lower than neutrality theory might predict. 
This conclusion generally holds even when mutations are assumed to be 
mildly deleterious (Nei 1983). One likely explanation for the relative pauci- 
ty of genetic variation is that long-term effective population sizes for most 
species are vastly smaller than otherwise might be supposed, given present- 
day census sizes. Another possibility involves repeated selective sweeps 
(see below). If large numbers of genes or sites are under directional selection 
even occasionally during evolution (as argued, for example, by Smith and 
Eyre-Walker 2002), then other loci physically linked to them will also tend 
to show reduced variation via hitchhiking effects, a phenomenon sometimes 
referred to as "genetic draft" (Gillespie 2001). 

The dearth of variation from the neutralist perspective extends to the 
level of some DNA sequences as well (Box 2.3). For example, with regard to 
mitochondrial DNA alleles (which are maternally inherited), the expected 
mean time to common ancestry under neutrality theory is approximately 


G = Nio ` (2.2) 


where G is the number of generations and Nye is the female evolutionary 
effective population size (Avise et al. 1988). Figure 2.3 plots values of Neve) 
for various species as estimated from observed mtDNA haplotype differ- 
ences, using conventional calibrations of evolutionary rate for the mtDNA 
molecule (Brown et al. 1979). Most observed values fall orders of magnitude 
below neutrality expectations extrapolated from present-day census population
sizes. In other words, despite extensive molecular genetic heterogeneity,
less mtDNA polymorphism is observed than neutrality theory
might have predicted, at least for species that are numerically abundant
today. Either the neutral mutation rate in mtDNA is much lower than normally
believed, or (more likely) evolutionary effective population sizes are
vastly lower than contemporary census population sizes in most species (for
any or all of the reasons outlined in Box 2.2).

Figure 2.3 Current census numbers of females ($N_F$) versus evolutionary effective
female population sizes ($N_{F(e)}$) for 18 marine species with high gene flow.
Effective population sizes were estimated from empirical data on mtDNA
nucleotide diversities. Both axes are on a logarithmic scale. Most observed values
fall well below the hypothetical line along which $N_F$ and $N_{F(e)}$ would be equal. (After Avise 2000a.)
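
The logic behind such estimates can be sketched numerically. The example below assumes the conventional mtDNA calibration of roughly 2% sequence divergence per million years between a pair of lineages (the figure usually attributed to Brown et al. 1979), converts a mean pairwise sequence difference into a mean coalescence time, and then applies Equation 2.2. The function name, the generation length, and the input values are hypothetical.

```python
def female_effective_size(mean_pairwise_divergence,
                          generation_years,
                          divergence_per_million_years=0.02):
    """Rough estimate of N_F(e) from mean pairwise mtDNA divergence.

    mean_pairwise_divergence: mean proportion of sites that differ between
        random pairs of haplotypes (e.g., 0.004 for 0.4%)
    generation_years: assumed generation length in years
    divergence_per_million_years: assumed pairwise calibration (2% per My)
    """
    # Mean time (in years) back to the common ancestor of two haplotypes.
    years = mean_pairwise_divergence / divergence_per_million_years * 1e6
    # Equation 2.2: the mean coalescence time in generations equals N_F(e).
    return years / generation_years


if __name__ == "__main__":
    # Hypothetical species: 0.4% mean divergence, 2-year generations.
    estimate = female_effective_size(0.004, 2.0)
    print(f"Estimated N_F(e): about {estimate:,.0f} females")
```

For a species with tens of millions of breeding females today, an estimate on the order of 10^5 would fall well below the line of equality in a plot like Figure 2.3.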

Such discoveries raise a remarkable irony about the neutrality-selection
debate that stems from the historical precedents of the classical-balance con- 
troversy. When extensive allozyme variation was uncovered in the seminal 
protein electrophoretic surveys, selectionists interpreted the observations as 
consistent with the balance view, and they sought (as described in the next 
section) to discover the selective forces responsible. At the same time, neu- 
tralists were facing the conceptual question of how to account for the pauci- 
ty of molecular polymorphism relative to neutrality expectations, given the 
mutation rates and population sizes thought to characterize most species. In 
summarizing this state of affairs, Nei and Graur (1984) concluded that 
"polymorphism is actually much lower than the neutral expectation and 
that if the bottleneck effect is not sufficient for explaining the observed level, 
the type of selection to be considered is not diversity-enhancing selection 
but diversity-reducing selection." The irony of this neutralist perspective in 
the history of the classical-balance debate is still not universally appreciat- 
ed by proponents of balancing selection. 

Another important aspect of the neutral mutation theory concerns pre- 
dictions about molecular evolutionary rates. Two aspects of these rates must 
be distinguished. With regard to shifts in frequencies of preexisting alleles, rates 
of neutral evolution should be greater in small populations. Genetic drift 
refers to random changes in allele frequency due to sampling variation of 
gametes from generation to generation, and it is a special case of the gener- 
al phenomenon of statistical sampling error, which is inversely related to 
sample size. However, with regard to the origin and substitution of new alleles, 
the rate of neutral evolution is, in principle, independent of population size 
and depends only on the mutation rate to neutral alleles. 

This latter conclusion can be demonstrated as follows. In a diploid pop- 
ulation of size N, there are 2N allelic copies of each nuclear gene. In time, 
descendants of only one of these copies will survive (i.e., only one is des- 
tined for fixation). The chance that any newly arisen neutral mutation will 
undergo random fixation is simply 1/2N. On the other hand, the probabili- 
ty that a new neutral mutation will arise in a population is 2Nu, where u is 
again the mutation rate to neutral alleles. It follows that the rate of fixation 
of new neutral mutations is the product of the origination rate of mutations 
and their probability of fixation once present, or 2Nu × 1/(2N) = u. In other
words, the rate of nucleotide substitution in evolution under strict neutrali- 
ty equals the rate of mutation to neutral alleles. This simple conclusion was 
the theoretical basis for the important neutrality prediction that biological 
macromolecules could provide reliable "molecular clocks" (see Chapter 3),
irrespective of population size. 
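
The argument above can be checked with a small simulation. The sketch below assumes an idealized Wright-Fisher population (discrete generations, random sampling of 2N gene copies, no selection) and estimates how often a single new neutral mutation drifts to fixation. It then multiplies the origination rate by the fixation probability. The function names, the population size, and the replicate count are illustrative choices.

```python
import random


def fixation_probability(n_diploid, replicates, seed=1):
    """Fraction of new neutral mutations that reach fixation.

    Each replicate starts with one mutant copy among 2N gene copies and
    resamples the population binomially every generation until the
    mutant is either lost or fixed.
    """
    rng = random.Random(seed)
    copies = 2 * n_diploid
    fixed = 0
    for _ in range(replicates):
        count = 1  # a single newly arisen mutant copy
        while 0 < count < copies:
            freq = count / copies
            # Draw 2N gametes; each is mutant with probability freq.
            count = sum(1 for _ in range(copies) if rng.random() < freq)
        if count == copies:
            fixed += 1
    return fixed / replicates


def substitution_rate(n_diploid, u):
    """Origination rate (2Nu) times fixation probability (1/2N)."""
    return (2 * n_diploid * u) * (1.0 / (2 * n_diploid))


if __name__ == "__main__":
    n = 10
    print("simulated fixation probability:", fixation_probability(n, 20000))
    print("expected 1/(2N):", 1 / (2 * n))
    for size in (10, 1000, 1000000):
        print(size, substitution_rate(size, 1e-6))
```

The simulated fixation probability should land close to 1/(2N), and the substitution rate returned for each population size is the same value u, which is the point of the derivation.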


Multi-locus allozyme heterozygosity and organismal fitness 


Empirically, does molecular variability matter to organismal fitness? 
Considering the protein electrophoretic results in the historical context of 
the classical-balance and neutralist-selectionist debates, it is hardly surprising
that many empirical population geneticists soon began to address how
natural selection might serve to maintain the newly discovered stores of 
molecular polymorphism. This problem was attacked in several ways. 

In the “multi-locus approach,” searches were launched for correlations 
between mean allozyme heterozygosity and life history attributes or fitness 
components. One early issue was whether protein variation might be corre- 
lated with environmental heterogeneity (Hedrick 1986; Levene 1953; Soulé 
and Stewart 1970), and some intriguing associations were reported. For 
example, Nevo and Shaw (1972) observed low heterozygosity (H) in burrowing
mole-rats (Spalax ehrenbergi) and attributed this outcome to selection
for homozygosity in their supposedly constant subterranean niche. 
Similarly, Avise and Selander (1972) reported much lower values of H in 
cave-dwelling than in surface-dwelling forms of the fish Astyanax mexicanus, 
although they provisionally attributed this outcome to effects of genetic 
drift in small cave populations (similar conclusions were later drawn from 
microsatellite data for these fishes; Strecker et al. 2003). Two influential studies
using experimental cages of Drosophila fruit flies reported significantly
higher allozyme heterozygosities in populations maintained under variable
than under uniform environmental regimes (McDonald and Ayala 1974;
Powell 1971).

More generally, Selander and Kaufman (1973a) suggested that genic 
heterozygosity was high in small sedentary animals that experience environments
as coarse-grained patches of alternative habitat types (Levins
1968), and significantly lower in large mobile animals that perceive environments
as fine-grained. Smith and Fujio (1982) concluded that allozyme
heterozygosities in marine fishes were correlated with the degree of habitat 
specialization. Powell and Taylor (1979) summarized evidence that environ- 
mental heterogeneity in conjunction with habitat choice contributed to 
genotypic diversity, whereas Valentine and Ayala (1974) favored an envi- 
ronmental selection model consistent with an observed correlation in 
marine invertebrates between low allozyme variation and trophic resource 
stability over time (Ayala et al. 1975a; Valentine 1976). 

On the other hand, Sage and Wolff (1986) suggested that patterns of 
genic heterozygosity might be attributable not to varying environmental 
selection pressures per se, but rather to environment-dependent population
histories and genetic drift effects. For example, several species of large
mammals in glaciated regions were noted to have lower allozyme variation
than their counterparts in temperate and tropical regions, purportedly due
to serial population bottlenecks accompanying recolonizations of northern
latitudes following retreats of the Pleistocene glaciers. Such observations,
although debatable in interpretation, raised an important general point: The 
extant standing crop of genetic variation must be a function of both the 
genetic diversity originally available to a species (its phylogenetic legacy) 
and contemporary processes, such as selection, gene flow, and the mating 
system, that further govern how that available variation is partitioned within
and among populations.

Positive correlations also were noted between magnitudes of allozyme
heterozygosity and particular life history attributes, such as short genera- 
tion times and small body and egg sizes in bony fishes (Mitton and Lewis 
1989; but see Waples 1991a) and high fecundity, outcrossing modes of repro- 
duction, pollination by wind, and long generation times in plants (Hamrick 
et al. 1979). Within species, positive correlations were reported between 
individual heterozygosity estimates and a wide variety of phenotypic char- 
acters arguably associated with fitness (reviews by Mitton 1993, 1994, 1997), 
such as exploratory behavior in mice (Garten 1977), antler characteristics in 
deer (Scribner and Smith 1990), shell shape in bivalve mollusks (Mitton and 
Koehn 1985), herbivory resistance in pines (Mopper et al. 1991), disease 
resistance and other phenotypes in salmonid fishes (Ferguson and 
Drahushchak 1990; Wang et al. 2002), and growth rates in several animal 
and plant species (Ferguson 1992; Garton et al. 1984; Koehn et al. 1988; Ledig 
et al. 1983; Mitton and Grant 1984; Pierce and Mitton 1982; Singh and 
Zouros 1978). Of course, caution is indicated in interpreting such observa- 
tions because various of these physiological and developmental characteris- 
tics might be functionally interrelated or statistically non-independent. 
Correlations were also noted between allozyme heterozygosity and devel- 
opmental stability, the latter supposedly evidenced by lower phenotypic 
variance between individuals (Lerner 1954; Zink et al. 1985) or by lower 
“fluctuating asymmetry” (the difference between bilateral features) within 
individuals (Allendorf and Leary 1986; Leary et al. 1985; Palmer and 
Strobeck 1986; Van Valen 1962). 

Another aspect of the multi-locus approach involved searches for
molecular or metabolic features correlated with allozyme heterozygosity. 
Among the examined factors arguably associated with genic variation were: 
molecular size of the enzyme (Eanes and Koehn 1978); quaternary structure 
of the protein (Solé-Cava and Thorpe 1989; Ward 1977; Zouros 1976); fre- 
quency of intragenic recombination (Koehn and Eanes 1976); physiological 
role in regulating flux through metabolic pathways (Johnson 1976b); and 
enzymatic action on intracellular versus extracellular substrates (Ayala and 
Powell 1972a; Gillespie and Langley 1974; Kojima et al. 1970; see also 
reviews in Koehn and Eanes 1978; Selander 1976). 

Several difficulties attended attempts to interpret such multi-locus asso- 
ciations, beyond the obvious point that correlation by itself cannot prove 
causality. First, there is likely to be a reporting bias in favor of positive correlations,
and the number of variables that can be examined is essentially
limitless. Second, mean heterozygosity as estimated from small numbers of 
protein loci may not accurately rank-order specimens with respect to 
genome-wide variability (Chakraborty 1981; Mitton and Pierce 1980), 
unless, perhaps, individuals vary dramatically along an outbred-inbred 
continuum due to demographic cycles, fine demic structure, or mating 
behaviors (Mitton 1993; Scribner 1991; Smith et al. 1975; Smouse 1986). This 
point led some authors to conclude that associations of individual het- 
erozygosity with personal fitness were attributable not to differing levels of 
genome-wide variation, but rather to physiological advantages stemming
from heterozygosity at the particular glucose-metabolizing or other
enzymes surveyed (or at tightly linked genes in the chromosomal blocks
that they mark; Koehn et al. 1983; Mitton 1997). Third, most of the heterozygosity
correlates listed above involved weak trends for which exceptions
could be cited or alternative explanations advanced. For example, with
regard to the possible relationship between mean H and environmental heterogeneity,
high genetic variability nonetheless was found to characterize
some species inhabiting proverbially stable environments such as underground
caves and the deep sea (Dickson et al. 1979), and low genetic variability
certainly can result from demographic population contractions in
any environment.

Such correlational approaches to studies of allozyme heterozygosity 
were summarized by Ward et al. (1992). From a survey of the literature on 
more than a thousand animal species, they concluded that approximately 
21%-34% of the variance in mean protein heterozygosity could be attributed 
to taxonomic effects (e.g., fishes tend to have low H values, amphibians the 
highest such values), and that 41%-52% of the variance was related to pro- 
tein effects (including size of the protein molecule, subunit number, enzyme 
function, etc.). In another such meta-analysis of protein electrophoretic studies,
Britten (1996) concluded that "selection, including overdominance, has
at most a weak effect at allozyme loci, and [this] casts some doubt on the
widely held notion that heterozygosity and individual fitness are strongly 
correlated." This conclusion generally echoed a sentiment expressed 20 
years earlier by Selander (1976): "Notwithstanding the immense amount of 
effort expended in surveying (allozyme) variation ... , the sample sizes of 
loci are generally inadequate for satisfactory analyses ... : molecular hetero- 
geneity is too great." The various correlations involving heterozygosity list- 
ed above are intriguing, but their interpretations remain controversial. 

In recent years, a few studies have revisited multi-locus heterozygosity-fitness
issues using polymorphic DNA-level markers (e.g., Hansson et al.
2001; Thelen and Allendorf 2001). For example, heterozygosity at multiple
microsatellite loci proved to be correlated with birth weight in red deer
(Cervus elaphus; Slate and Pemberton 2002) and with birth weight and neonatal
survival in harbor seal pups (Phoca vitulina; Coltman et al. 1998a). An
accumulation of such analyses might someday help to disentangle three primary
hypotheses for such oft-observed heterozygosity-fitness correlations:
"overdominance" precisely at the loci scored (i.e., the scored genes themselves
affect fitness), "associative overdominance" (in which other loci
linked to those scored are actually under balancing selection), and genome-wide
heterozygosity fitness effects. However, most available data remain
inconclusive on such issues (Coltman and Slate 2003; Hansson and 
Westerberg 2002). At least as important will be the use of molecular mark- 
ers in conjunction with experimental studies that move beyond empirical 
correlations and attempt to test alternative causal mechanisms. For example, 
by surveying microsatellite loci in experimentally inbred lines of Crassostrea
oysters, Launey and Hedgecock (2001) uncovered evidence for an abundance
of deleterious recessive mutations as well as segregation distortions
in F2 hybrids, two results predicted for high genetic load under models of
associative overdominance. Because it documented this basis for inbreeding
depression experimentally, this study was quite relevant to the 25-year-long
debate concerning correlations between growth rate and heterozygosity in
bivalve mollusks.
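
The correlational step in such heterozygosity-fitness studies can be sketched as follows. The example scores each individual's multi-locus heterozygosity as the proportion of surveyed loci that are heterozygous and then computes a Pearson correlation with a fitness proxy. All genotypes and trait values are hypothetical, the function names are illustrative, and (as emphasized above) a correlation of this kind cannot by itself distinguish among the competing causal hypotheses.

```python
from math import sqrt


def multilocus_heterozygosity(genotype):
    """Proportion of scored loci at which an individual is heterozygous.

    genotype: list of (allele_1, allele_2) tuples, one tuple per locus.
    """
    het_loci = sum(1 for a, b in genotype if a != b)
    return het_loci / len(genotype)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical microsatellite genotypes (allele sizes) at four loci.
    individuals = [
        [(120, 120), (98, 98), (210, 210), (155, 155)],
        [(120, 124), (98, 98), (210, 210), (155, 155)],
        [(120, 124), (98, 102), (210, 210), (155, 159)],
        [(118, 124), (98, 102), (210, 214), (155, 159)],
    ]
    birth_weight = [5.9, 6.4, 6.3, 7.0]  # hypothetical fitness proxy (kg)
    h = [multilocus_heterozygosity(g) for g in individuals]
    print("heterozygosity:", h)
    print("r =", round(pearson_r(h, birth_weight), 3))
```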


Single-locus allozyme variation and the vertical approach 


Frustration with such multi-locus approaches to assessing natural selec- 
tion’s role led other researchers to the "vertical" approach, wherein protein 
polymorphisms at specific loci were studied at multiple levels ranging from 
biochemistry, physiology, and developmental expression to transmission 
patterns, population dynamics, and ecological associations (Clarke 1975; 
Koehn and Hilbish 1987; McDonald 1983). Intensive studies of several such 
model systems all uncovered convincing evidence for differences between
allozyme genotypes upon which selection probably operates (Table 2.1). 


Protein                                   Organism

Alcohol dehydrogenase                     Drosophila (fruit flies)
α-Glycerophosphate dehydrogenase          Drosophila
Carboxylesterase                          Drosophila
Glucose-6-phosphate dehydrogenase;        Drosophila
  6-phosphogluconate dehydrogenase
Glucose-phosphate isomerase               Colias (butterflies)
Glutamate pyruvate transaminase           Tigriopus (copepods)
Lactate dehydrogenase                     Fundulus (killifish)
Leucine aminopeptidase                    Mytilus (mollusks)

In each case, documented kinetic differences between allelic products suggested that the
polymorphisms were maintained by natural selection.

Unfortunately, rather few polymorphisms were analyzed so intensively.
Furthermore, most of the allozyme loci studied had been identified a priori 
as likely candidates for natural selection; thus, polymorphisms analyzed by 
the vertical approach undoubtedly constituted a biased sample with regard 
to issues of selective maintenance. 


Selection at the level of DNA


As data at the level of DNA sequences increasingly became available during 
the 1980s and later, they too fed into ongoing neutrality-selection debates in 
molecular evolution (Orr 2002). These data came not only from nuclear 
genes, but also from loci housed in cytoplasmic organelles (McCauley 1995; 
Rand 2001). Because DNA-level assays typically were conducted one locus at 
a time (but see above), attention usually shifted from estimates of composite 
or “genome-wide” heterozygosity (as in the multi-locus allozyme era) to pos- 
sible molecular signatures of natural selection or other evolutionary forces on 
particular sequences of linked nucleotide sites. Many of these efforts 
involved detailed statistical analyses of DNA sequences. For example, from 
a sliding-window analysis of nucleotide sequence divergence along the
length of the ADH gene in Drosophila melanogaster, Kreitman and Hudson
(1991) identified a sharp spike of variation consistent with the operation of
balancing selection on one specific portion of the ADH locus. 
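
The general idea of a sliding-window scan can be sketched in a few lines. The example below is a simplified stand-in rather than the procedure of Kreitman and Hudson (1991): it computes only within-sample nucleotide diversity in successive windows along a set of aligned sequences, so that a local peak of variation stands out. The function names, window settings, and toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of sites differing between all pairs of aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_window_diversity(seqs, window, step):
    """Return a list of (window_start, diversity) along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Toy alignment: variation is concentrated in the middle four sites.
    alignment = [
        "AAAAAAAAAAAA",
        "AAAAACGAAAAA",
        "AAAAATCAAAAA",
        "AAAAAAAAAAAA",
    ]
    for start, pi in sliding_window_diversity(alignment, window=4, step=4):
        print(f"sites {start}-{start + 3}: diversity = {pi:.3f}")
```

In practice, overlapping windows (a step smaller than the window) give a smoother profile, and the observed profile would be compared against a neutral expectation rather than judged by eye.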

Dramatic evidence for long-term balancing selection also emerged 
from fine-scale molecular characterizations of genes of the MHC (major 
histocompatibility complex) in mammals (Klein et al. 1993, 1998; Takahata 
et al. 1992) and a self-incompatibility mating locus in plants (Clark 1993). In 
each case, different alleles in extant populations proved to have much 
deeper evolutionary separations than neutrality theory predicted. Indeed, 
the MHC complex in particular has become a paradigm molecular system 
for developing and evaluating statistical analyses of genetic data for evi- 
dence of natural selection (Garrigan and Hedrick 2003). Another such 
example is represented in Figure 2.4. Two other notable early quests to 
identify the molecular footprints of natural selection involved detailed 
analyses of genes encoding an esterase (Cooke and Oakeshott 1989; Odgers 
et al. 1995) and a superoxide dismutase (Hudson et al. 1997; Lee et al. 1985) 
in Drosophila. 

Figure 2.4 Phylogeny of alleles at the Rpm1 gene. The Rpm1 gene is involved in
pathogen recognition in Arabidopsis thaliana plants. Note that in several parts of the
world, this species retains highly divergent alleles conferring pathogen resistance
(R) and sensitivity (S), suggesting that long-term balancing selection has played a
role in allelic maintenance. Note also the two distinct clusters of R alleles, divergence
between which is apparently due to a historical recombination event between
the R and S alleles at one end of the sequenced region. (After Stahl et al. 1999.)

Today, several statistical approaches are fairly standard practice for
deducing the probable action of various forms of natural selection on DNA 
sequences (Bustamante et al. 2002; Ford 2002). One of the first such tests, 
proposed by Hughes and Nei (1988) in their analyses of MHC variation, asks 
whether replacement substitutions in a protein-coding gene outnumber 
silent substitutions. If so, positive selection on advantageous alleles is impli- 
cated, because neutrality theory predicts that synonymous substitutions in a 
gene should be more common than those that result in amino acid replace- 
ments. The Hughes-Nei test is highly conservative, however, because it 
normally detects only strong instances of positive selection (Sharp 1997). 
Another popular statistical method, introduced by Tajima (1989a), compares 
numbers of segregating sites and the mean number of nucleotide differences 
estimated from pairwise DNA sequence comparisons within a population. 
The resulting statistic (Tajima’s D) is often used to test selective neutrality of
DNA sequences under an "infinite-site" model (Watterson 1975), but particular
outcomes can also be affected by (i.e., indicative of) historical demographic
events such as dramatic expansions in population size (Aris-Brosou
and Excoffier 1996; Tajima 1989b). For addressing historical population
growth explicitly, Fu (1997) introduced a statistical test that distinguishes
excesses of low-frequency alleles in an expanding population as compared 
with the number expected in a static population. 
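
A minimal sketch of the Tajima computation follows. It assumes the normalizing constants as they are commonly stated for Tajima's (1989a) statistic, takes as input the sample size, the number of segregating sites, and the mean number of pairwise differences, and includes a helper that extracts those three quantities from an alignment. The function names and the toy alignment are hypothetical, and no significance test is attempted here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size n, segregating sites S, and mean
    number of pairwise nucleotide differences."""
    if n < 4 or seg_sites < 1:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of the neutral parameter.
    d = mean_pairwise_diff - seg_sites / a1
    return d / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


def summarize_alignment(seqs):
    """Return (n, S, mean pairwise differences) for aligned sequences."""
    n = len(seqs)
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    pairs = list(combinations(seqs, 2))
    total = sum(sum(1 for a, b in zip(s1, s2) if a != b) for s1, s2 in pairs)
    return n, seg_sites, total / len(pairs)


if __name__ == "__main__":
    # Toy sample in which every variant is a singleton (rare alleles only).
    sample = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
    n, s, k = summarize_alignment(sample)
    print(n, s, k, round(tajimas_d(n, s, k), 3))
```

An excess of low-frequency variants, as in the toy sample, pushes D below zero. That is the pattern expected after a selective sweep or a population expansion, which is why demographic history has to be considered before selection is inferred.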

Two other popular statistical approaches involve comparing patterns of
DNA sequence variation within and between populations or species. The
HKA (Hudson, Kreitman, and Aguadé) test (Hudson et al. 1987) addresses
whether levels of DNA polymorphism and divergence are significantly correlated,
as might be predicted under neutrality theory because both the
extent of polymorphism of a gene and its evolutionary rate are primarily
functions of mutation rates to neutral alleles. The test requires an extensive
series of DNA sequences both within and among populations, and it
assumes that effective population sizes have remained roughly constant
throughout the relevant evolutionary time frame. Examples of empirical outcomes
can be found in Moriyama and Powell (1996) and Wells (1996).

The MK test (McDonald and Kreitman 1991) is similar in its basic rationale,
but it examines, for any protein-coding gene, whether the ratio of nonsynonymous
to synonymous nucleotide differences between related species
is the same as that within species. If the ratios differ, departures from neutrality
may be indicated. For example, if replacement substitutions are relatively
more frequent as fixed differences between species than as polymorphisms
within them, an implication is that the nucleotide fixations were
selectively promoted; but if the ratio of replacement to silent substitutions is
significantly lower in the fixed differences between species than in intraspecific
polymorphisms, then purifying selection against replacement substitutions
may have taken place (for further explanation, see Nei and Kumar 2000;
Sawyer and Hartl 1992). The MK test has been applied widely, with various
forms of sequence-level selection sometimes deduced (and sometimes not;
Eanes 1994; Hey and Kliman 1993) in the genes of protists (Escalante et al.
1998), plants (Purugganan and Suddith 1998), and animals (Eanes et al.
1993). Caution again is indicated, however, because the assumptions underlying
the MK test (e.g., lack of multiple substitutions at the same site, absence
of codon usage bias, and lack of temporal variation in the mode of selection)
could, if violated, compromise the interpretations (Akashi 1995; Eyre-Walker
1997; Ohta 1993; Whittam and Nei 1991). In general, all such statistical models
entail assumptions (sometimes subtle) that often make their results less
than definitive.
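
The comparison at the heart of the MK test reduces to a 2 × 2 table of counts (replacement versus silent, fixed versus polymorphic). The sketch below evaluates such a table with a two-sided Fisher's exact test, which is swapped in here as one common way to test a 2 × 2 table and is not claimed to be the procedure of the original authors. The counts are hypothetical and the function names are illustrative. The ratios require nonzero silent counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= observed * (1 + 1e-9))


def mk_test(fixed_repl, fixed_silent, poly_repl, poly_silent):
    """Return (fixed ratio, polymorphic ratio, P-value) for MK-style counts."""
    ratio_fixed = fixed_repl / fixed_silent
    ratio_poly = poly_repl / poly_silent
    p_value = fisher_exact_two_sided(fixed_repl, fixed_silent,
                                     poly_repl, poly_silent)
    return ratio_fixed, ratio_poly, p_value


if __name__ == "__main__":
    # Hypothetical counts: replacements are over-represented among
    # fixed differences relative to polymorphisms.
    fixed_ratio, poly_ratio, p = mk_test(20, 10, 5, 40)
    print(f"fixed: {fixed_ratio:.3f}  polymorphic: {poly_ratio:.3f}  P = {p:.2e}")
```

With these invented counts the replacement-to-silent ratio is much higher among fixed differences than among polymorphisms, the pattern that the text associates with selectively promoted fixations. A significantly lower ratio among fixed differences would instead point toward purifying selection.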

Reservations notwithstanding, these and other statistical tests for selec- 
tion on DNA sequences have proved extremely helpful in identifying puta- 
tive selective footprints in the genome (Golding 1994; Olson 2002). Especially 
noteworthy is empirical evidence for strong positive selection on replace- 
ment substitutions in certain protein-coding loci that apparently must deal 
with highly variable selective challenges. Examples of such genes are those 
that encode species-specific egg-sperm interactions in marine invertebrates 
(Galindo et al. 2003; Metz and Palumbi 1996; Swanson and Vacquier 1998),
self-fertilization avoidance mechanisms in plants (Clark and Kao 1991), dis- 
ease resistance operations in mammals (Tanaka and Nei 1989; Zhang et al. 
1998), and digestive functions in primates (Messier and Smith 1997). 

Recent selection may also be signaled by abnormally low population 
variation in specific islands of linked nucleotide sites, an outcome that could 
evidence either "hitchhiking via selective sweeps" or "background selec- 
tion" (Figure 2.5). Especially when recombination rates are low, any posi- 
tively selected mutation that sweeps through a population to fixation 
inevitably carries along other tightly linked markers with which it happened
to be associated at its time of origin. This genetic hitchhiking temporarily
purges a chromosomal region of preexisting sequence variety, an
effect that dissipates only gradually as genetic variation is restored by muta- 
tion (Dorit et al. 1995; Nurminsky et al. 1998). Similarly, background selec- 
tion against deleterious alleles can reduce the level of polymorphism in a 
DNA region with a low recombination rate (Charlesworth et al. 1993). 
Conversely, balancing selection on a target locus or nucleotide position can 
buffer against allelic extinction and thereby elevate levels of genetic polymorphism
at linked neutral sites.

Of course, any genomic signature of natural selection deduced strictly
from statistical inference about nucleotide patterns is merely a starting point 
for understanding how natural selection mechanistically operates on gene 
sequences. Further exploration can then focus more precisely on particular 
environmental or cellular selective forces that may be involved. 


Figure 2.5 General effects of positive, negative, and balancing selection on levels
of genetic variation at neutral sites adjacent to true targets of selection. Each
diagram shows linked markers in a population of homologous chromosomes
before and after the indicated mode of selection. Open circles are neutral positions;
solid circles are mutations actually under functional selection.


The unresolved status of the controversy


Neutrality-selection debates have dominated both the theoretical and
empirical sides of population genetics for several decades (see Ayala 1976a;
Gillespie 1991; Lewontin 1974, 1991; Mitton 1994; Nei and Koehn 1983 for
seminal reviews). Much has been learned, not least about the surprising
intractability of the problem and the difficulty of obtaining the hard evi- 
dence required for its resolution. It has become far clearer, for example, that 
correlational studies (such as comparisons of multi-locus heterozygosity 
levels across environments or taxa), intriguing though they may be, are 
inadequate; that "vertical" gene-by-gene analyses of specific polymor- 
phisms are necessary, but extraordinarily difficult, and are likely to yield a 
biased sample of outcomes; and that statistical analyses of DNA sequences 
can help to detect particular departures from neutrality, but alone, fall far 
short of yielding what is ultimately desired: mechanistic descriptions of 
evolutionary forces that sustain genetic variation and govern molecular 
change. 

There are further reasons why these debates are not yet fully resolved. 
First, selection theory and neutrality are both immensely powerful concep- 
tual constructs in the sense that they can accommodate a wide variety of 
molecular outcomes simply by varying key assumptions and parameters. 
Because of the multitudinous ways in which natural selection can operate, 
falsification of all selectionist scenarios for a given empirical data set is nearly 
impossible (indeed, this was a prime motivation for development of a quantitative
neutrality theory that specifies expectations explicitly). But neutrality
theory also can yield a nearly limitless array of predictions by altering
parameters ($N_e$, $u$, and selection intensities for or against alleles that may be
nearly neutral) that are notoriously difficult to measure in natural popula- 
tions. Furthermore, many apparent departures from neutrality expectations 
probably are due to idiosyncratic historical events that have removed popu- 
lations and species from the equilibrium conditions often assumed in various 
neutral models. 

Another challenge underlying the neutrality-selection controversy is in 
defining exactly what is meant by natural selection. For example, is the phe- 
nomenon of meiotic drive (whereby certain alleles appear to "cheat" during 
meiosis by distorting Mendelian segregation ratios in their favor) to be 
viewed as a form of natural selection at the gametic level? In general, are 
"selfish genes" that compete for transmission within an organismal lineage 
to be interpreted as evolving under the influence of selection? Holmquist 
(1989) and Avise (2001b) have argued that in sexual species, multitudinous 
quasi-independent genes and other DNA sequences (such as transposable 
elements) that co-inhabit an extended cell lineage form an interactive com- 
munity of oft-competing as well as collaborating entities, rather like indi- 
viduals in a miniature ecological community. This vast molecular interplay 
is quite different from what traditionally has been meant by natural selec- 
tion at the organismal level (Ohta 1992b), but it emphasizes the emerging 
view that natural selection can entail differential reproduction (i.e., differen- 
tial replicative fitness) at hierarchical levels below (Dawkins 1989) as well as 
above (Gould 1980; Wilson 1980) that of individual organisms. 

The issue of exactly what is meant by natural selection also arises in the 
context of interpreting genomic patterns. For example, even a single muta- 
tion under positive selection can promote a selective sweep that leaves a 
long-lasting footprint or genomic signature across a multitude of adjacent 
nucleotide sites (see Figure 2.5). Thus, the evolutionary dynamics and levels 
of genetic variation in a whole chromosomal region with a low recombina- 
tion rate can be governed or influenced by natural selection even if the vast 
majority of pre- and post-sweep variants within that region are absolutely
neutral in functional or operational terms. These two distinct senses or 
meanings of natural selection (mechanistic versus evolutionary dynamic) 
must be distinguished carefully in all discussions of molecular evolution, 
including those involving molecular markers. 

Notwithstanding these many difficulties of interpretation, it has become
clear that the final truth about selection versus neutrality lies somewhere
between the two polarized views. One emerging consensus seems to be that 
natural selection often works most effectively on exon sequences, but typi- 
cally less so on introns and many spacer regions between genes. Another 
emerging view is that some DNA sequences unquestionably have been sub- 
ject at times to natural selection of various forms (balancing, positive, and 
purifying), whereas segregating alleles at many other loci have been mechanistically
neutral, or nearly so, throughout most or all of their evolutionary
durations. What remains contentious are the overall fractions and composi- 
tions of DNA sequences falling into these categories, the frequency distri- 
butions of selection intensities, and (in most cases) the exact operational 
connections between molecular variation and organismal fitness. 


Must Molecular Markers Be Neutral To Be Informative? 


Do the continuing uncertainties about the relative roles of selected and neu- 
tral mutations in evolution seriously compromise the utility of molecular 
polymorphisms as genetic markers? A common sentiment is that molecular 
markers gain their informativeness by being neutral, or nearly so, but this is 
a considerable oversimplification. In many microevolutionary applications, 
such as forensic identification and genetic parentage analysis, any non-neu- 
trality of chosen markers is normally inconsequential to the outcome. In 
other applications, such as in assessing population structure and gene flow,
genetic markers under intense selection could be misleading if interpreted
under the assumption of neutrality (see Chapter 6). For example, strong
balancing selection via heterosis (fitness superiority of heterozygotes) can
inhibit population differentiation in allele frequencies by random drift and 
result in uniform spatial patterns that could be misinterpreted as evidence 
for high gene flow under neutral models. Conversely, strong habitat-specif- 
ic or diversifying selection on particular marker loci could promote a false 
illusion of pronounced population isolation when actually there had been 
considerable genetic interchange. 

With regard to estimating species' phylogenies, many considerations 
apply, depending on the nature of selection, its effect on character state
distributions, and the methods of phylogenetic reconstruction. Balancing
selection can again complicate the effort by acting, for example, to retain
particular molecular polymorphisms across successive speciation nodes.
Furthermore, different intensities of directional or diversifying selection can 
compromise some types of data analysis by generating significant rate het- 
erogeneity in DNA sequence evolution. However, different mutation rates 
at neutral loci also generate rate heterogeneity, yet these are dealt with rou- 
tinely in appropriate phylogenetic analyses. Indeed, even dramatic rate 
variation across genes or nucleotide sites can be of genuine benefit, because 
it offers researchers the opportunity to choose molecular markers ideally 
suited to the particular phylogenetic questions being addressed (see 
Chapter 3 and elsewhere throughout this book). 

When Lewontin (1991) asked whether the introduction of molecular 
methods had been a milestone or a millstone for evolutionary studies, it was 
primarily in the context of illuminating the nature of fitness and adaptation. 
From the perspective of providing utilitarian markers that open a world of 
novel opportunities for studying organismal relationships and behaviors in 
nature, the molecular revolution has been an unqualified success, as I hope
this book will testify.


The Molecule-Morphology Debate


Especially in the 1960s and 1970s, many systematists trained in traditional 
organismal disciplines viewed the nascent molecular revolution with con- 
siderable skepticism, if not consternation. Molecular biology and genetics 
seemed like alien fields, and their rapid growth threatened the long-stand- 
ing dominance of comparative morphology and behavior in systematics 
and phylogenetic biology. In some circles, this created an underlying tone of 
antagonism toward molecular approaches that persisted for quite some 
time. And it certainly did not help that many molecular biologists were ill- 
informed about ecology and evolution. 

In recent years, a more appreciative attitude has emerged with the 
recognition that molecular and organismal data can be reciprocally inform- 
ative, and indeed, often require each other’s services (Hillis 1987). For exam- 
ple, a new enterprise in comparative evolution may be termed “phyloge- 
netic character mapping” (see Chapter 8). This approach typically involves 
plotting the taxonomic distributions of morphological (or other) characters 
along molecule-inferred phylogenies, the intent being to uncover the evolu- 
tionary origins and histories of organismal attributes (Harvey et al. 1996). 
Phylogenetic character mapping also can be conducted in the reverse direc- 
tion—that is, by plotting the distributions of molecular characters along a 
morphology-inferred phylogeny. Several instances of horizontal gene trans- 
fer between otherwise unrelated organisms have been revealed under the 
compelling logic of this approach (see Chapter 8). Such enlightened analy- 
ses that attempt to capitalize on the cross-comparative information content 
of multiple classes of data are now leading to a more contented marriage 
between molecule-based and morphology-based approaches to ecology and 
evolution. Under this developing perspective, the interplay between alter- 
native lines of evidence becomes of greater interest and value than either 
data source considered alone. 

One entrenched notion in biology is that mean rates of morphological 
and molecular evolution are largely uncoupled: Different taxa sometimes 
evolve at grossly different rates with respect to phenotype (Simpson 1944), 
whereas molecular evolution often proceeds at a fairly steady pace (Wilson 
et al. 1977). Notable examples of this disconnect include "living fossils," 
such as species of horseshoe crabs that have remained nearly static in
phenotype for tens of millions of years, yet are highly divergent at the
molecular level (Avise et al. 1994); and conversely, phylogenetic groups such as
the cichlid fishes in Africa’s Lake Victoria that in recent times have
radiated into arguably hundreds of morphologically recognizable species (Turner
et al. 2001) that nonetheless often remain nearly indistinguishable at the 
molecular level (Meyer et al. 1990). Such apparent discrepancies in evolu- 
tionary rates for molecules and morphology helped to motivate the idea 
that changes in gene regulation may play a hugely disproportionate role in 
phenotypic evolution (Carroll et al. 2001; King and Wilson 1975). More 
recently, however, a broad review of empirical evidence led Omland (1997)
to conclude that rates of molecular and morphological evolution usually are
correlated, and that earlier researchers understandably had emphasized 
the interesting exceptions. In another such meta-analysis, but with a differ- 
ent outcome, Bromham et al. (2002) found no evidence for an association 
between morphological and molecular evolutionary rates. 

Such rate controversies notwithstanding, the fact remains that most 
molecule-based phylogenies mirror quite closely their predecessors based
on morphology, particularly within well-studied taxonomic groups. Why
should this be, given that much (certainly not all) of phenotypic evolution is
Darwinian (adaptive), whereas much (certainly not all) of molecular evolu- 
tion seems to be nearly neutral? Two factors probably contribute. 

First, when integrated over long sweeps of time and across large num- 
bers of characters (morphological or molecular), overall magnitudes of 
divergence between species should tend to be correlated, at least crudely,
with times elapsed since common ancestry. If molecules and morphology 
both march appreciably in step with the passage of time (either by accumu- 
lation of neutral mutations or via changing selection regimes), their histories 
will appear correlated, even if these two suites of characters themselves are 
functionally mostly independent. To the extent that functional links also 
exist (as they certainly must) between molecules and morphology, such cor- 
relations between the two would only be elevated. 

Second, overall phenetic or genetic divergence is not the only, or even nec- 
essarily the best, guide to phylogeny (see Chapter 3). Most phylogenetic algo- 
rithms in use today give added weight to specific character states that (for log- 
ically defensible reasons) should be most indicative of historical genealogy. 
Thus, when applied to either morphological or molecular traits, such algo- 
rithms are likely to converge on the one-and-true organismal phylogeny. 


Molecular Phylogenetics 


While the grand controversies described above were being played out, other 
researchers adopted the more pragmatic approach of simply applying
molecular markers to resolvable problems in natural history and genealogical
assessment. Beginning as a subsidiary endeavor in molecular ecology and
evolution, this previously neglected approach has grown steadily and now
occupies a position of central prominence in biology, as this book will attest. 
Advances in the molecular marker field have proceeded as a series of 
major waves, each initiated by the development of a new laboratory method. 
A typical pattern is as follows: A novel assay technique for proteins or DNA 
is introduced, and a flood of evaluative activity follows. Methods that fail to 
live up to advance billing (e.g., because of technical difficulty, poor repeata- 
bility in outcomes, or ambiguities of genetic interpretation) are abandoned. 
Approaches that survive the initial evaluations are then employed to address 
broad conceptual topics in the research paradigms of molecular evolution 
described above. For example, questions inevitably arise about what role, if
any, natural selection plays in maintaining molecular variation revealed by
the new laboratory techniques. Observational or experimental data are
gathered and evaluated against the predictions of neutrality theory. Discussions
also ensue about how best to analyze and interpret the new classes of molec- 
ular data. 

In the meantime, genetic markers provided by each new method are 
applied to interesting problems in natural history or evolution where their use 
appears appropriate. Success in such endeavors stimulates further interest, 
and the more utilitarian molecular approaches eventually gain wide popular- 
ity. Usually, after a period of several years, the enthusiasm crests, and a new 
wave of interest may focus on another newly introduced assay procedure. 
Typically, the earlier methods are not abandoned, but merely become incor- 
porated into the growing pool of molecular techniques that find continued 
application in studies of organismal biology, natural history, and evolution. 

Protein electrophoresis was the first molecular approach employed 
widely in the field. The utilitarian allozyme markers it reveals are protein 
variants that behave in straightforward Mendelian fashion and, hence, are 
interpretable as simple allelic products of a gene. (The term “isozyme” refers 
to the broader class of all protein variants observed on electrophoretic gels, 
including heteromeric products of multiple loci, post-translational variants, 
and other protein alterations.) Allozyme methods were introduced in the 
mid-1960s and dominated molecular ecology and systematics for the next 10 
years (see early reviews in Avise 1974, 1983; Buth 1984; Gottlieb 1977; Whitt 
1983, 1987). Today, protein electrophoresis remains a simple and useful 
method for generating molecular markers. 

The next technique to become popular in molecular ecology and evolu- 
tion involved analyses of "restriction fragment-length polymorphisms" 
(RFLPs) in DNA. For both technical and conceptual reasons, mitochondrial
(mt) DNA received most of the early attention. Mitochondrial approaches 
dominated the field during the late 1970s and 1980s (see early reviews in 
Avise 1986, 1991a; Avise and Lansman 1983; Birley and Croft 1986; Harrison 
1989; Moritz et al. 1987; Palmer 1990; Wilson et al. 1985), much as had 
allozyme studies a decade earlier. Today, strong interest in mtDNA contin- 
ues, although the markers usually now come from direct sequencing of par- 
ticular mitochondrial genes. In the middle and late 1980s, another wave of 
excitement attended RFLP analyses as applied to hypervariable nuclear 
DNA regions known as minisatellites. Because of the power of these multi- 
locus assays to distinguish individuals, this class of procedures became 
known as "DNA fingerprinting” (Burke 1989; Hill 1987; Jeffreys et al. 
1985a,b, 1988a; Kirby 1990). 

Another revolution began at about that same time with the introduction 
of the PCR (polymerase chain reaction) technique for in vitro amplification 
of specific DNA fragments (Erlich and Arnheim 1992; Erlich et al. 1991;
Mullis 1990; Mullis et al. 1986; Saiki et al. 1988; White et al. 1989). This dis- 
covery spurred at least three major breakthroughs in marker acquisition. 
First, when coupled with further development of amplification primers 
(Kocher et al. 1989) and improved laboratory methods for rapidly
sequencing DNA fragments (Innis et al. 1988; Scharf et al. 1986; Wrischnik et al.
1987), PCR-based approaches afforded direct access to the vast
phylogenetic information content of nucleotide sequences. The many thousands of PCR
"primers" that are now available greatly facilitate direct sequencing of a 
wide variety of nuclear and cytoplasmic loci. Second, when used to ampli- 
fy an abundant class of newly discovered microsatellite loci (Litt and Luty 
1989; Tautz 1989; Weber and May 1989), PCR methods could be used to tap 
another vast wellspring of genetic polymorphism. Microsatellite loci contain 
variable numbers of tandem repeat units, each about 2-5 nucleotide pairs 
long, and they typically far surpass allozyme loci in heterozygosity levels 
and numbers of alleles per locus (Figure 2.6). Accordingly, these molecular
markers have found wide application in wildlife forensics, parentage
analyses, and other microevolutionary studies (Goldstein and Schlötterer 1999).
Finally, because the PCR can amplify DNA sequences from minuscule
amounts of tissue, or even from some well-preserved fossils, it has
extended molecular applications to a much wider biological arena (see Chapters 6,
8, and 9).

As elaborated in Chapter 3, several other laboratory methods have also
made significant contributions to molecular ecology and evolution.
Prominent among these have been protein immunological comparisons,
which provided some of the first evidence for molecular clocks (Benjamin et
al. 1984; Goodman 1963; Maxson and Maxson 1986, 1990; Sarich and Wilson
1966, 1967; Wilson et al. 1977), and DNA-DNA hybridization methods
(Britten and Kohne 1968; Doty et al. 1960), which had special influences on
the systematics of particular taxonomic groups, such as insects (Caccone
and Powell 1987; Caccone et al. 1988a,b), birds (Sibley and Ahlquist 1990),
marsupial mammals (Springer and Kirsch 1991), and hominoid primates
(Caccone and Powell 1989; Sibley and Ahlquist 1987).

Figure 2.6 Genetic variation at microsatellite loci. Distributions are shown for
(A) heterozygosity values and (B) numbers of alleles per locus at each of 524
microsatellite loci surveyed within local natural populations of 78 animal species.
Note in particular the high heterozygosity levels compared with those typifying
most allozyme loci (see Figure 2.1). (After DeWoody and Avise 2000.)

BOX 2.4 A Brief Chronology of Some Important Events
Relevant to the (Remarkably Recent) History of
Molecular Markers

1944 Avery, MacLeod, and McCarty provide evidence that DNA is the genetic
material.

1953 Watson and Crick propose a molecular model for DNA structure.

1955 Smithies uses starch-gel electrophoresis to identify protein
polymorphisms.

1963 Margoliash determines amino acid sequences for cytochrome c in several
taxa and generates the first phylogenetic tree for a specific gene product.

1966 Several independent researchers use electrophoretic methods and
histochemical enzyme stains to assess levels of genetic variability in
animal populations and humans.

1967 Sarich and Wilson provide an early application of protein immunological
methods and discover a far more recent shared ancestry for humans and
great apes than previously suspected.

1968 Kimura proposes the neutral theory of molecular evolution. Meselson
and Yuan isolate and characterize the first specific restriction enzyme.
Britten and Kohne use DNA hybridization methods to characterize animal
genomes.

1971 Publication of the first periodical devoted to molecular evolution
(Journal of Molecular Evolution).

1975 Southern describes a method for the transfer of DNA fragments to
nitrocellulose filters, hybridization to radioactive probes, and detection
of fragments by autoradiography.

1977 Maxam and Gilbert, and also Sanger and colleagues, describe laboratory
methods for DNA sequencing.

1978 Maniatis and colleagues develop a procedure for gene isolation that
involves construction and screening of cloned libraries of eukaryotic DNA.

1979 Avise and colleagues and also Brown and colleagues introduce
mtDNA approaches to analyses of natural populations.

1981 Palmer and colleagues help initiate an important series of papers
utilizing cpDNA for phylogenetic reconstruction in plants.

1983 The journal Molecular Biology and Evolution is launched.

1985 Jeffreys and colleagues develop multi-locus DNA fingerprinting and
point out its potential for forensic science. Saiki, Mullis, and colleagues
report the enzymatic in vitro amplification of DNA via the polymerase
chain reaction.

1987 Avise and colleagues characterize phylogeography as a new approach
to population genetics.

1989 Kocher and colleagues report the discovery of conserved PCR primers
that can be employed to amplify mtDNA segments from many species.
Several seminal papers identify the utility of microsatellite loci as a
source of highly polymorphic molecular markers.

1992 Periodicals devoted to evolutionary applications for molecular
markers, such as Molecular Ecology, Molecular Phylogenetics and Evolution,
and Molecular Marine Biology and Biotechnology, proliferate.

1996 Edited volumes by Avise and Hamrick and by Smith and Wa[yne]
emphasize roles for molecular markers in conservation biology.

2000 The journal Conservation Genetics is launched. The first textbook
summarizing the field of phylogeography (by Avise) is published.

2001 Draft sequences of the human genome are published by Lander and
colleagues, and by Venter and colleagues.

Given the burgeoning interest today in molecular ecology and phyloge- 
netics, it is useful to remain cognizant of the remarkably shallow history of 
these scientific disciplines. An abbreviated chronology of salient events in the 
development and application of molecular markers is provided in Box 2.4. 


SUMMARY 


1. The study of ecology and evolution from a molecular perspective is a fairly 
recent enterprise, beginning in substantive form only in the latter half of the 
twentieth century.

2. Several fundamental controversies have dominated molecular evolution and
related fields. These controversies include the classical-balance debate on the
magnitude of genetic variation, the neutrality-selection debate on the adaptive
significance of molecular variation, and arguments over the relative utility of
molecular versus morphological characters in phylogenetics and systematics.

3. These grand controversies raised issues that certainly are germane, but are
seldom crucial, to the interpretation of molecular polymorphisms as genetic
markers in ecology and evolution. Indeed, to a considerable extent, the
debates diverted attention from the many utilitarian applications for protein
and DNA polymorphisms in investigating natural history and organismal
phylogeny. Only in relatively recent years have large numbers of researchers
begun to shift their primary focus to these latter, highly informative arenas.

4. Several waves of excitement in molecular ecology and evolution have
followed the introduction of new laboratory techniques. Among the most
influential methods have been protein electrophoresis in the late 1960s and 1970s,
RFLP analyses of mitochondrial DNA in the late 1970s and 1980s, multi-locus
DNA fingerprinting in the late 1980s, and PCR-mediated DNA sequencing as
well as microsatellite assessments, beginning mostly in the 1990s.

















Molecular Techniques 


Perhaps nowhere has the power of the scientific method been more brilliantly 
demonstrated than in the development of procedures for the study of the 
chemistry of life. 

M. O. Dayhoff and R. V. Eck (1968) 


Many different classes of laboratory assays can be employed to reveal molecular 
genetic markers. Detailed protocols are published elsewhere (key literature is 
cited herein), but in practice there is no substitute for hands-on training “at the 
bench" under the guidance of an experienced practitioner. Thus, this chapter will 
merely outline laboratory methods, focusing on the most popular and the most 
historically important. These methods will be treated roughly in the chronologi- 
cal order of their introduction to the field. Another intent of this chapter is to 
emphasize the conceptual basis of, the nature of the genetic information provid- 
ed by, and the rationale underlying each of the various molecular approaches that 
has had a major impact on molecular ecological and evolutionary studies. 


Protein Immunology 


Crude immunological assays (using unpurified proteins in simple precipitin 
tests) were introduced a century ago (Nuttall 1904), but the development of 
micro-complement fixation (MCF) methods in the 1960s allowed outcomes to be 
quantified in terms of immunological distances (ID values), which proved to be 
phylogenetically informative. The MCF procedure is outlined in Figure 3.1 
(details are in Champion et al. 1974; Maxson and Maxson 1990). First, a protein 
such as albumin is purified from a reference species using standard biochemistry.
This highly purified protein is injected several times into a host species
(often a rabbit) over several months. One week after the last injection, host
antiserum is collected and standardized to a given level of reactivity under
specified MCF conditions. When an MCF assay is to be performed, the
antiserum is mixed with varying concentrations of antigen from diluted plasma
(purified protein is not required) of a test species. Included in each reaction
is a group of proteins (“complement”), normally found in vertebrate serum,
that become trapped in developing lattices of the antigen-antibody
complex. A spectrophotometer is then used to quantify the amount of
complement “fixed”: The greater the cross-reactivity between antibody and
antigen, the more fixation. The MCF technique has been shown to be capable of
detecting even single amino acid replacements in the antigenic site of a
challenging protein. For the albumin molecule in particular, ID as measured by
MCF has been shown to be a linear function of the number of amino acid
substitutions between the reference and test species (Maxson and Maxson
1986; Prager and Wilson 1993).

Figure 3.1 General protocol for immunological comparison of proteins by
micro-complement fixation. Steps shown: 1. Purify protein from reference
species; 2. Inject into rabbit; 3. Collect antibodies; 4. Prepare extract from
test species. (After Wilson 1985.)

Albumin was the usual protein of choice in vertebrate MCF studies for 
several reasons: it is abundant, ubiquitous, and easily purified; it consists of 
one subunit encoded by a single gene; the molecule has many (25-50) major 
antigenic sites at which amino acid substitutions were detectable by MCF 
(Benjamin et al. 1984); and it typically proved useful for phylogenetic
studies at the taxonomic levels of families or genera. Other proteins sometimes
used in MCF assays included lysozyme, ovalbumin, and transferrin in ver- 
tebrates (Leone 1964; Prager and Wilson 1976; Wright 1974), acid phos- 
phatase, glycerophosphate dehydrogenase, and larval proteins in inverte- 
brates (Beverley and Wilson 1985; Collier and MacIntyre 1977; MacIntyre et 
al. 1978), and alkaline phosphatase in bacteria (Cocks and Wilson 1972; 
Maxson and Maxson 1990). 

For a complete assessment of the taxa under consideration, MCF requires 
that antibodies be produced from each species. Otherwise, missing elements 
in the pairwise ID matrix compromise phylogenetic reconstruction. 
Generating and testing antisera for multiple species is time-consuming, but 
offers the added advantage of permitting evaluations of reciprocity (anti-A vs. 
B; anti-B vs. A). Differences between reciprocal outcomes provide a measure 
of experimental error in the MCF procedure (Maxson and Wilson 1975). 
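The reciprocity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species labels and ID values are hypothetical, and the mean absolute difference used as the error summary is a convenient stand-in, not necessarily the statistic employed by Maxson and Wilson (1975).

```python
from itertools import combinations

def reciprocity_summary(id_matrix, species):
    """Compare reciprocal immunological distances (IDs).

    id_matrix[a][b] is the ID obtained with antiserum raised against
    species a and tested against antigen from species b.
    """
    diffs = {}     # absolute difference between the two reciprocal measurements
    averaged = {}  # mean of the two reciprocal measurements for each pair
    for a, b in combinations(species, 2):
        forward = id_matrix[a][b]  # anti-a vs. b
        reverse = id_matrix[b][a]  # anti-b vs. a
        diffs[(a, b)] = abs(forward - reverse)
        averaged[(a, b)] = (forward + reverse) / 2
    mean_diff = sum(diffs.values()) / len(diffs)
    return diffs, averaged, mean_diff

# Hypothetical ID values for three species, for illustration only.
species = ["A", "B", "C"]
id_matrix = {
    "A": {"A": 0, "B": 20, "C": 41},
    "B": {"A": 24, "B": 0, "C": 35},
    "C": {"A": 39, "B": 37, "C": 0},
}
diffs, averaged, mean_diff = reciprocity_summary(id_matrix, species)
```

If antiserum were unavailable for one species, the corresponding row of the matrix would be empty and no reciprocal comparison could be made for any pair involving it, which is the incompleteness problem noted in the text.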

Protein immunology (especially via the MCF technique) was one of the 
earliest methods of molecular phylogenetics and provided some of the first 
evidence for molecular clocks. Although the technique has been superseded 
by other approaches since the late 1980s, it remains of interest both for its
historical significance and for its conceptual and operational distinctness 
from other molecular methods. 


Protein Electrophoresis 


Protein electrophoretic techniques were introduced in the mid-1960s, and 
they still remain a simple, popular, and powerful workhorse for generating 
Mendelian nuclear markers in many ecological and evolutionary
applications. Detailed descriptions of laboratory techniques can be found in Baker
(2000), Harris and Hopkinson (1976), Müller-Starck (1998), Murphy et al. 
(1996), Selander et al. (1971), and Shaw and Prasad (1970).

Protein electrophoresis takes advantage of the fact that non-denatured
proteins with different net charges migrate at different rates through starch 
or acrylamide gels (or other supporting media such as cellulose acetate 
strips) to which an electric current is applied (Figure 3.2). These charge fea- 
tures stem primarily from the three amino acids with positive side chains 
(lysine, arginine, and histidine) and the two with negative side chains 
(aspartic acid and glutamic acid). A protein’s net charge, which varies with 
the pH of the running condition, determines the protein’s movement 
toward the positive pole (anode) or negative pole (cathode) in a gel. Protein 
size and shape can also interact with a gel’s pore size to influence migration 
properties. 
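The dependence of net charge on pH can be illustrated with a rough calculation. The sketch below assumes the Henderson-Hasselbalch relationship and approximate textbook pKa values (which vary by source), and it considers only the five charged side chains named above plus the two chain termini; other ionizable groups, and all effects of protein folding, are ignored. The function name and the two peptide sequences are hypothetical.

```python
# Approximate side-chain pKa values; published figures differ slightly by source.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}  # lysine, arginine, histidine
NEGATIVE_PKA = {"D": 3.9, "E": 4.1}              # aspartic acid, glutamic acid

def net_charge(sequence, ph, n_term_pka=9.0, c_term_pka=2.3):
    """Crude estimate of the net charge of a peptide at a given pH."""
    # Amino terminus: fraction protonated (positively charged).
    charge = 1.0 / (1.0 + 10 ** (ph - n_term_pka))
    # Carboxyl terminus: fraction deprotonated (negatively charged).
    charge -= 1.0 / (1.0 + 10 ** (c_term_pka - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge

# Two hypothetical allelic products differing by one glutamic acid -> lysine change.
variant_a = "MKTAEELGD"
variant_b = "MKTAKELGD"
charge_a = net_charge(variant_a, ph=8.0)
charge_b = net_charge(variant_b, ph=8.0)
```

Under these assumptions the single replacement shifts the estimated net charge by roughly two units at pH 8, the kind of difference that would be expected to alter mobility in a gel, whereas a replacement between two uncharged amino acids would leave the estimate unchanged.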

Figure 3.2 General protocol for protein electrophoresis. Steps shown:
1. Dissect tissues; 2. Homogenize; 3. Centrifuge, collect supernatant;
4. Electrophorese; 5. Stain gel slice; 6. Score population.

Because of their low cost, safety, and ease of use, gels made from
hydrolyzed potato starch are popular. The starch-gel electrophoresis (SGE)
procedure begins with extraction of unpurified water-soluble proteins from
a particular source (leaves, roots, liver, heart, blood, skeletal muscle, etc.).
The extract from each specimen is absorbed onto a paper wick, and 20 or
more such wicks are placed side by side along a slit (the origin) in the gel.
The gel is placed in a buffer tray connected to an electrical power supply,
and electrophoresis proceeds over several hours. The gel is then removed 
and sliced horizontally, and each wafer-thin slice is incubated with a histo- 
chemical stain specific for a particular enzyme (Hunter and Markert 1957). 
Each stain contains a commercially available substrate for the enzyme, nec- 
essary cofactors, and an oxidized salt (usually nitro-blue tetrazolium, or 
NBT). For example, the staining solution for lactate dehydrogenase (LDH) 
includes lactic acid (the substrate), nicotinamide adenine dinucleotide (the 
cofactor), phenazine methosulfate (an intermediary catalyst), and NBT. At 
each position in the gel to which LDH from a specimen migrated, a reaction 
is catalyzed whereby lactic acid is oxidized to pyruvic acid and the salt is 
reduced to a blue precipitate visible to the naked eye as a discrete band. 
Such bands collectively are the enzyme's "zymogram" pattern, and they 
usually can be interpreted in simple genetic terms. 

Histochemical stains single out the products of particular genes from 
among the thousands of other undetected proteins also migrating through a 
gel. When coupled with improvements in electrophoretic procedures and 
media, such stains eliminated the need for laborious protein purification 
procedures that had always precluded direct sequencing of amino acids for 
most population applications. Recipes for more than a hundred allozyme 
stains are available, but not all enzymes resolve well for a given taxon, so a 
typical multi-locus SGE survey involves successful assay of about 10-30 
enzymes, perhaps encoded by 15-50 loci (some enzymes are encoded by 
multiple genes). 


Mendelian markers 


Often, it is quite feasible to assay hundreds or even thousands of
individuals in a given SGE study. One starch gel carrying extracts from about 25
individuals can be sectioned into perhaps five replicate slices, and each slice can
be incubated with a different stain. Twenty such gels per day can be run in 
an active laboratory. Thus, a total of 2,500 genotypes (25 individuals x 20 
gels x 5 enzyme stains) could be scored with just several hours of effort. This 
is a conservative estimate because many enzyme stains reveal genotypes for 
two, three, or four genes whose products catalyze the same reaction. Such 
masses of genetic data are incredible by the standards of the pre-molecular 
era, when elucidation of even a handful of Mendelian genotypes in a few 
specimens required monitoring cross-generation inheritance patterns. 
Indeed, shortly after the onset of the allozyme revolution in the mid-1960s, 
vastly more genotypic data from natural populations had been gathered 
than in all the preceding 100 years since Gregor Mendel. 
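The throughput arithmetic above is simple enough to restate as a small function. The parameter names, and the extra `loci_per_stain` factor, are illustrative assumptions introduced here to show why the figure of 2,500 is conservative.

```python
def genotypes_per_day(individuals_per_gel=25, gels_per_day=20,
                      stains_per_gel=5, loci_per_stain=1):
    """Single-locus genotypes scorable in one day of starch-gel work."""
    return individuals_per_gel * gels_per_day * stains_per_gel * loci_per_stain

baseline = genotypes_per_day()                  # one locus resolved per stain
two_loci = genotypes_per_day(loci_per_stain=2)  # each stain resolving two loci
```

With the default values the function returns 2,500, matching the estimate in the text; if each stain resolved two loci, the count would double.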

Zymogram patterns are normally interpretable as Mendelian genotypes 
at specific loci. The Mendelian basis of an observed polymorphism may be 
verified in several ways. First, most enzymes have a known quaternary
structure that predicts characteristic gel-band signatures for Mendelian
variants (Figure 3.3). For example, phosphoglucomutase (PGM) is a 
monomer, meaning that it is composed of a single polypeptide with enzy- 
matic activity. Thus, a PGM homozygote shows one band on gels, and a 
heterozygote shows two bands (one produced by each of the two PGM alle- 
les). Glucose-6-phosphate isomerase (GPI) is an example of a dimer (a mol- 
ecule whose catalytic activity requires the union of two polypeptide sub- 
units). Thus, a GPI heterozygote displays a three-band gel profile (with the 
middle band about twice the intensity of the flanking bands), reflecting 
random associations between polypeptides produced by the two alleles. 
Purine-nucleoside phosphorylase (PNP) is an example of a trimer, and 
LDH is a tetramer, such that heterozygotes for such loci can exhibit four- 
band and five-band zymograms, respectively, with characteristic band 
intensities. Sometimes, polypeptide subunits encoded by two loci join 
together to form active enzymes. Zymogram patterns in such cases are 
more complex, but nonetheless readily interpretable using an extension of 
the logic described above. 
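
The band counts and relative intensities described above follow from 
binomial expectations, and can be sketched in a few lines of Python. This 
is only an illustration under the assumptions stated in the text (both 
alleles produce similar amounts of polypeptide, and subunits assemble at 
random); the function name heterozygote_bands is hypothetical and not part 
of any published protocol. 

```python
from math import comb


def heterozygote_bands(subunits: int) -> list[tuple[str, int]]:
    """Expected bands for a single-locus heterozygote (alleles A and A').

    Assumes equal polypeptide output from both alleles and random
    subunit assembly, so band weights are binomial coefficients.
    Returns (subunit composition, relative intensity) pairs.
    """
    bands = []
    for k in range(subunits + 1):
        # k subunits come from allele A', the remainder from allele A
        composition = "a" * (subunits - k) + "a'" * k
        bands.append((composition, comb(subunits, k)))
    return bands


if __name__ == "__main__":
    examples = [("monomer (PGM)", 1), ("dimer (GPI)", 2),
                ("trimer (PNP)", 3), ("tetramer (LDH)", 4)]
    for name, n in examples:
        ratio = ":".join(str(weight) for _, weight in heterozygote_bands(n))
        print(f"{name}: {n + 1} bands, expected intensity ratio {ratio}")
```

Real zymograms can depart from these ratios when the two alleles are 
expressed unequally or when assembly is nonrandom. 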


[Figure 3.3 diagram. Top row of panels: single locus, monomer; single locus, 
dimer; single locus, trimer; single locus, tetramer. Bottom row of panels: two 
loci, monomer; two loci, dimer; two loci, tetramer. Each panel shows lanes for 
genotypes AA, AA', and A'A' (each combined with BB at the second locus in the 
two-locus panels), with bands labeled by subunit composition.] 





Figure 3.3 Single-locus and multi-locus zymogram patterns. Lowercase letters 
indicate polypeptide subunits produced by alleles A or A' at locus 1, and by allele 
B at locus 2; uppercase letters indicate diploid genotypes. Shown in each central 
lane (column) is the expected zymogram pattern for a heterozygous individual 
when the enzyme in question is monomeric, dimeric, trimeric, or tetrameric, and in 
cases in which either one or two loci are involved. For example, for a tetrameric 
protein encoded by a single gene, a heterozygote should exhibit five gel bands 
with the following subunit compositions: aaaa, aaaa', aaa'a', aa'a'a', and a'a'a'a'. If the 
two alleles produce similar polypeptide concentrations and subunit assembly is 
random, then intensities of the five respective bands should appear in the 
approximate ratio 1:4:6:4:1. 

A second approach to verifying a Mendelian basis for zymogram variation 
entails experimental crosses. For example, heterozygous progeny with 
predictable zymogram patterns should emerge from a cross between two 
alternative homozygotes, and backcrosses of such progeny to either parent 
should produce heterozygotes and appropriate homozygotes in nearly 
equal frequency. In the early years of protein electrophoretic surveys, such 
experimental validation of genotypes was important, but as experience with 
the simple genetic bases of banding patterns for commonly used enzymes 
grew, routine direct corroboration became less critical. Finally, population 
genetic considerations can help verify a Mendelian basis for zymogram 
variation. For outcrossed populations that are free from microspatial subdi- 
vision, observed frequencies of allozyme genotypes normally agree with 
frequencies predicted under Hardy-Weinberg equilibrium. 
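
As a rough illustration of this population genetic check, the sketch below 
computes Hardy-Weinberg expectations for a two-allele locus from observed 
genotype counts and a simple chi-square statistic for the fit. The genotype 
counts are hypothetical, and the function names are illustrative 
assumptions rather than part of any standard package. 

```python
def hardy_weinberg_expected(n_AA: int, n_AAp: int, n_ApAp: int) -> dict[str, float]:
    """Expected genotype counts for alleles A and A' under Hardy-Weinberg."""
    n = n_AA + n_AAp + n_ApAp
    # Allele frequency of A estimated by gene counting
    p = (2 * n_AA + n_AAp) / (2 * n)
    q = 1.0 - p
    return {"AA": p * p * n, "AA'": 2 * p * q * n, "A'A'": q * q * n}


def chi_square(observed, expected) -> float:
    """Sum of (observed - expected)^2 / expected over genotype classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    observed = [36, 48, 16]  # hypothetical counts of AA, AA', A'A'
    expected = hardy_weinberg_expected(*observed)
    stat = chi_square(observed, expected.values())
    print(expected, f"chi-square = {stat:.3f}")
```

For a two-allele locus the statistic is compared against a chi-square 
distribution with one degree of freedom. A poor fit does not by itself 
rule out a Mendelian basis, because inbreeding or population subdivision 
can also produce departures from Hardy-Weinberg expectations. 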

Most allozyme polymorphisms revealed by electrophoresis are proba- 
bly attributable to nucleotide substitutions causing replacements of charged 
amino acids, but direct molecular confirmation of this is seldom available. If 
distinct allozyme alleles at a locus are not further characterized at the molec- 
ular level, they must be viewed as qualitative multi-state traits whose phy- 
logenetic order cannot be safely inferred from the observable property: elec- 
trophoretic mobility. Allozyme data consist, then, of specified genotypes at 
each of perhaps dozens of typically unlinked nuclear loci (Pasdar et al. 1984; 
Shows 1983; Wheat et al. 1973). Apart from their many applications as 
Mendelian markers in such areas as parentage assessment and gene flow 
estimation, allozyme frequencies at multiple loci can also be employed to 
compute quantitative estimates of genetic distance between populations or 
species. 
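
The passage does not name a particular distance statistic. As one commonly 
used possibility, the sketch below computes Nei's standard genetic distance 
from allele frequencies at several loci. The choice of this measure, the 
data layout (one dictionary of allele frequencies per locus), and the 
function name are all assumptions made for illustration. 

```python
from math import log, sqrt


def nei_standard_distance(pop_x: list[dict[str, float]],
                          pop_y: list[dict[str, float]]) -> float:
    """Nei's standard genetic distance D = -ln(I) between two populations.

    Each population is a list with one dict per locus, mapping allele
    name to allele frequency. Loci must be in the same order in both lists.
    """
    jx = jy = jxy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        alleles = set(freqs_x) | set(freqs_y)
        jx += sum(freqs_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(freqs_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)
    # Sums over loci stand in for averages; the number of loci cancels out
    identity = jxy / sqrt(jx * jy)
    return -log(identity)


if __name__ == "__main__":
    # Hypothetical allele frequencies at two allozyme loci
    pop_1 = [{"a": 1.0}, {"b": 0.7, "b'": 0.3}]
    pop_2 = [{"a": 0.5, "a'": 0.5}, {"b": 0.7, "b'": 0.3}]
    print(f"D = {nei_standard_distance(pop_1, pop_2):.4f}")
```

The function raises a ValueError when the two populations share no alleles 
at any locus, because the genetic identity is then zero and its logarithm 
is undefined. 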


Idiosyncratic protein features 


Occasionally, protein electrophoresis reveals additional kinds of phyloge- 
netic markers. For example, among 26 avian orders and more than 40 
assayed taxonomic families, only woodpeckers (Picidae), honeyguides 
(Indicatoridae), barbets (Capitonidae), and toucans (Ramphastidae) consistently 
exhibited a three-band zymogram pattern for malate dehydrogenase 
(MDH). This unique gel profile, perhaps attributable to a gene duplication 
event (Avise and Aquadro 1987), helped settle a long-standing debate 
about whether these superficially different birds are indeed phylogenetical- 
ly allied, as their traditional placement within Piciformes would suggest. 
However, apart from identifying this single clade (monophyletic group), the 
MDH marker was of no further use for avian phylogenetic assessment 
(unlike allozyme allele frequency data, which provided a more comprehen- 
sive picture of piciform relationships; Lanyon and Zink 1987). Thus, the rar- 
ity of idiosyncratic allozyme characters is both a phylogenetic blessing and 
a curse—rarity suggests monophyly and implies that evolutionary clades 
earmarked by eccentric molecular features may be quite secure, but it also 
means that few such genetic markers will normally be present in a given 
data set. 

Many enzymes are encoded by two or more loci that arose through gene 
duplications via polyploidy, aneuploidy, or regional intra-chromosome 
duplication (Buth 1983; Macintyre 1976; Ohno 1970). Their zymogram patterns 
usually are interpretable from rules governing polypeptide assembly 
into functional enzymes with known quaternary structures (see Figure 3.3). 
Using such evidence as well as more detailed DNA-level characterizations of 
various diploid species of Clarkia plants, several gene duplications have been 
discovered (Ford and Gottlieb 1999; Ford et al. 1995; Gottlieb 1988; Soltis et 
al. 1987) and used to identify putative clades (Gottlieb and Ford 1996). On the 
other hand, such phylogenetic exercises (Sytsma and Smith 1992) entail sev- 
eral possible complications: the possibility of convergent origin of a duplicate 
gene in independent lineages; post-duplication gene silencing (perhaps on 
multiple occasions in different lineages) of either member of a duplicate pair 
(Ford and Gottlieb 2002; Gottlieb and Ford 1997); and the possibility that a 
duplicated locus is a retained ancestral condition at the taxonomic level 
examined. The latter two possibilities have indeed been documented in 
Clarkia (Gottlieb 1988). All three possibilities emphasize the importance of 
distinguishing orthology from paralogy (see Chapter 1) when drawing phy- 
logenetic conclusions from multi-gene families. 

The protein products of duplicated genes usually diverge in structure 
and regulatory control after the duplication event and may show striking 
ontogenetic changes or tissue specificities of potential relevance to phyloge- 
netic assessment. For example, whereas most vertebrates express a single 
GPI gene, bony fishes express two unlinked GPI loci, one predominantly in 
skeletal muscle and the other in liver, where they perform somewhat differ- 
ent functions (Avise and Kitto 1973; Whitt et al. 1976). All vertebrates exam- 
ined (with the exception of lampreys) have muscle- and heart-specific LDH 
expression involving two genes (Markert et al. 1975). As gauged by zymo- 
grams, the assembly of heterotetramers between these LDH loci is sometimes 
nonrandom, presumably due to taxon-specific genetic regulatory influences 
(Murphy 1988; Sites et al. 1986). Some birds (doves) and mammals have a 
third LDH gene expressed only in primary spermatocytes (Blanco and 
Zinkham 1963; Matson 1989; Zinkham et al. 1969). Bony fishes also carry a 
third LDH gene, expressed in a variety of tissues in primitive fish species but 
predominantly in the eyes or liver of advanced teleosts (Horowitz and Whitt 
1972; Markert and Faulhaber 1965; Shaklee et al. 1973; Whitt et al. 1975). It is 
doubtful that this fish gene is orthologous to the third locus in birds or mammals 
(Fisher et al. 1980; Quattro et al. 1993). Other multi-locus allozyme systems 
that have been studied extensively with regard to gene expression patterns 
and phylogeny include malate dehydrogenase, glycerol-3-phosphate 
dehydrogenase, creatine kinase (Buth et al. 1985; Fisher and Whitt 1978; 
Fisher et al. 1980; Philipp et al. 1983a), and the globin superfamily of 
oxygen-carrying molecules (Dayhoff 1972; Doolittle 1987). 

Duplicate genes are also subject to evolutionary silencing or loss, and 
these outcomes can be phylogenetically informative. For example, Ferris 
and Whitt (1978, 1979) used patterns of enzyme loss and change in gene 
expression to reconstruct phylogeny in catostomid suckers, a group of 
freshwater fishes that underwent a polyploidization event approximately 
50 million years ago and subsequently became "diploidized" at approxi- 
mately 50% of assayed structural genes. The rationale for this phylogenet- 
ic reconstruction was as follows. Immediately following the polyploidiza- 
tion event, all loci in the ancestral sucker genome must have been dupli- 
cated, and the "primitive" condition from that point forward became pres- 
ence of each duplicate gene. As mutations accumulated, some genes lost 
expression (becoming pseudogenes), whereas others diverged in structure 
and function (Ferris and Whitt 1977). These processes presumably were 
nearly irreversible (but see Buth 1979, 1982), such that taxa sharing the 
derived states (loss or alteration of gene expression) probably are mono- 
phyletic (assuming that identical changes did not occur independently in 
separate evolutionary lineages). 

Not all idiosyncratic gene expression patterns are phylogenetically 
informative, however. For example, Mindell and Sites (1987) assayed tissue 
expression profiles for about 30 allozyme loci in species representing two 
avian orders (Charadriiformes and Passeriformes), and they observed 
numerous overt discrepancies with accepted taxonomy. These authors con- 
cluded that "widespread homoplasy, opposite polarities and limited pre- 
dictive capability for the isozyme tissue expression patterns suggest that 
most may be more useful in studies of gene regulation than in higher level 
taxonomy." 





DNA-DNA Hybridization 


DNA-DNA hybridization relies on the double-stranded nature of duplex 
DNA and the fact that paired nucleotides on the two complementary 
strands are held together by hydrogen bonds (two coupling each ade- 
nine-thymine base pair and three coupling each guanine-cytosine). These 
hydrogen bonds are the weakest links in DNA, so when native DNA is heated 
to 100°C in solution, the duplexes dissociate or “melt” into single strands. 
When the sample is cooled, these strands collide by chance, and those of 
complementary sequence reassociate into duplex molecules as hydrogen 
bonds re-form between matched bases. A rapidly reassociating component 
represents repetitive DNA, because these homologous strands are most 
numerous and collide most frequently. This fraction is removed. The 
remaining fraction is composed of single- or low-copy sequences in the 
genome. These DNA strands are then added to a mixture under conditions 
permitting duplex formation. The mixture may contain DNA strands from 
a single sample or species (yielding homoduplexes) or it may contain 
strands from two species (yielding heteroduplex molecules). The final step 
in the hybridization protocol involves characterizing the thermal stabilities 
of these homo- and heteroduplexes by gradually raising the temperature 
and monitoring the course of molecular dissociation to single strands. The 
thermal stability exhibited by any duplex depends largely on the similarity 
of nucleotide sequences in its two strands, because only properly paired 
bases form hydrogen bonds. The measured difference in thermal stability 
between homoduplexes and heteroduplexes provides a quantitative esti- 
mate of the genetic divergence between the species examined. 

More details about DNA hybridization are as follows (Figure 3.4; exact 
methods are described by Werman et al. 1996): 


* First, DNA is extracted from the nuclei of cells, separated from RNA 
and proteins, and physically sheared into fragments averaging 500 
nucleotides in length (to reduce viscosity and to permit subsequent frac- 
tionation of repetitive from low-copy DNA). 


* The sheared fragments are boiled, cooled, and their reassociation kinet- 
ics employed to remove the highly repetitive fraction (Britten and 
Kohne 1968; Britten et al. 1974). This is accomplished by incubating the 
DNA in solution at about 50°C for a short time, such that repetitive 
sequences preferentially anneal and most low-copy sequences remain 
unpaired. 

* This solution is passed through a hydroxyapatite column that binds 
double-stranded DNA only. Single-stranded DNA that passes through 
the column is labeled with radioactive iodine and becomes known as 
the tracer. 


* Tracer DNA is then mixed with a much larger amount of unlabeled 
DNA (called the driver) from the same or a different species. This mixture 
is incubated at 60°C for several days to form hybrid molecules that 
have one labeled and one unlabeled strand. 


* The sample is then placed on a hydroxyapatite column and gradually 
heated at 2.5°C increments over a 60-95°C range. At each temperature 
increment, additional duplexes that have melted (a function of degree of 
base-pair mismatch) are washed from the column into a beaker. Counts 
of radioactivity in the beakers record the amount of duplex DNA that 
melted at various temperatures. 


Raw data from DNA-DNA hybridization consist of “thermal elution 
profiles” (Figure 3.5) that summarize observed percentages of dissociated, 
single-stranded DNA as a function of melting temperature. From these melt- 
ing curves, quantitative estimates of the degree of base-pair mismatch 
between DNAs under comparison can be derived (Britten 1986; Kirsch et al. 
1990; Sarich et al. 1989; Sheldon and Bledsoe 1989). One such measure of 
genetic divergence is based on the temperature (Tm) at which 50% of the 
hybrid molecules remain in duplex condition. The difference in Tm values 
between homoduplex and heteroduplex melting profiles (ΔTm) provides a 
genetic distance estimate. When such estimates are available for pairwise 
comparisons among three or more species, they can be used as a basis for 
phylogenetic reconstruction. 

Figure 3.4 General protocol for DNA-DNA hybridization. 
(After Sibley and Ahlquist 1986.) 

Figure 3.5 Thermal elution profiles from DNA-DNA hybridization. Shown are 
cumulative melting curves for the single-copy fraction of nuclear DNA in several 
flightless ratite birds. (A) Homoduplex DNA of emu (Dromaius novaehollandiae) 
(solid squares), and heteroduplexes between that species and southern cassowary 
(Casuarius casuarius) (open squares), greater rhea (Rhea americana) (open circles), 
ostrich (Struthio camelus) (solid triangles), and chicken (Gallus gallus, a non-ratite 
outgroup) (solid circles). In these assays, cassowary appears genetically closest to 
emu, followed in order by rhea, ostrich, and chicken. (B) Homoduplex DNA of 
ostrich (solid squares) and its heteroduplex with rhea (open squares). Note that 
although the melting curves involving rhea and ostrich are nearly identical when 
compared with emu (panel A), this does not necessarily imply that rhea and ostrich 
are genetically close to one another. Indeed, differences between the melting curves 
in panel B indicate a large genetic distance between rhea and ostrich. (After Sibley 
and Ahlquist 1990.) 

The relationship between ΔTm and percent base-pair mismatch is 
approximately linear (Britten et al. 1974; Caccone et al. 1988b; Kohne 1970), 
as shown by studies of thermal stability of synthetic oligonucleotides or 
other sequences of known base composition (Bautz and Bautz 1964; Hutton 
and Wetmur 1973; Laird et al. 1969; Springer et al. 1992). As a working rule 
of thumb, each increase in ΔTm of 1°C translates to an additional 1% or 2% 
base-pair mismatch in DNA (Britten 1986; Caccone et al. 1988b; Koop et al. 
1986; Powell et al. 1986). 
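
To make the arithmetic concrete, the sketch below estimates Tm from a 
cumulative melting curve by linear interpolation, takes the difference 
between a homoduplex and a heteroduplex curve, and applies the 1% to 2% 
rule of thumb. The melting data are hypothetical, and linear interpolation 
is an assumption made for simplicity; published studies fit melting curves 
in various ways. 

```python
def melting_temperature(temps: list[float], percent_melted: list[float]) -> float:
    """Temperature at which 50% of duplexes have melted (Tm).

    temps: elution temperatures in ascending order (degrees C).
    percent_melted: cumulative percentage of single-stranded DNA at each
    temperature. Tm is found by linear interpolation between the two
    points that bracket 50%.
    """
    points = list(zip(temps, percent_melted))
    for (t0, m0), (t1, m1) in zip(points, points[1:]):
        if m0 <= 50.0 <= m1 and m1 > m0:
            return t0 + (50.0 - m0) * (t1 - t0) / (m1 - m0)
    raise ValueError("melting curve does not cross 50%")


if __name__ == "__main__":
    temps = [70.0, 75.0, 80.0, 85.0, 90.0]
    homoduplex = [2.0, 10.0, 30.0, 70.0, 98.0]      # hypothetical data
    heteroduplex = [10.0, 40.0, 80.0, 95.0, 100.0]  # hypothetical data

    delta_tm = (melting_temperature(temps, homoduplex)
                - melting_temperature(temps, heteroduplex))
    # Rule of thumb from the text: 1 degree C of delta-Tm ~ 1% to 2% mismatch
    print(f"delta Tm = {delta_tm:.2f} C, "
          f"roughly {delta_tm:.1f}% to {2 * delta_tm:.1f}% base-pair mismatch")
```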

Because the DNA hybridization method in effect yields a mean genetic 
difference across a large fraction (the low-copy portion) of any two genomes 
compared, it was sometimes promoted as one of the strongest possible 
sources of phylogenetic information (Sibley and Ahlquist 1990). Indeed, inter- 
taxon genetic distances from DNA hybridization have been shown to correlate 
reasonably well with those from direct sequencing of some mitochondrial 
and nuclear genes (van Tuinen et al. 2000). However, reservations about the 
approach were voiced as well: The raw data consist solely of distance values, 
rather than discrete character states amenable to cladistic analysis (see below), 
and the factors that can affect the kinetics of hybridization (such as differences 
in base composition, DNA fragment size, and genome size) are incompletely 
understood. Some of these factors nonetheless are partially controlled in most 
DNA hybridization studies. For example, effects of base composition differ- 
ences (numbers of A-T versus C-G pairs) among sequences can be ameliorat- 
ed by use of chaotropic solvents (Werman et al. 1996), and genetic compar- 
isons can be confined to particular taxonomic groups (such as birds), within 
which confounding variables such as pronounced differences in genome 
structure or base composition should be minimized. 

The development of automated thermal elution devices (e.g., the 
"DNAnalyzer" of Sibley and Ahlquist 1981) greatly expedited the process of 
gathering DNA hybridization data. Indeed, the honor for the largest set of 
animals included in any molecular systematic survey might still belong to 
Sibley and Ahlquist (1990), who conducted nearly 30,000 DNA-DNA 
hybridizations on 1,700 avian species. Although a few research laboratories 
were devoted to DNA hybridization assays, the technique was not other- 
wise widely employed by molecular systematists. The method nevertheless 
had a large impact on ornithology and some other fields, and it remains of 
interest for the contrasts it provides with most other genetic approaches 
(which typically focus on molecular details at far fewer loci). 


Restriction Analyses 


The discovery of restriction endonucleases (Linn and Arber 1968; Meselson 
and Yuan 1968) revolutionized molecular biology. Type II restriction 
enzymes (Kessler 1987) cleave duplex DNA at particular oligonucleotide 
sequences, usually either 4, 5, or 6 base pairs in length (Roberts 1984). For 
example, EcoRI (named after the bacterium Escherichia coli from which it was 
isolated) acts like a precise scalpel that cuts double-stranded DNA wherev- 
er the non-methylated sequence 5'-GAATTC-3' occurs. Several hundred 
such enzymes, most with different recognition sequences, have been isolat- 
ed and characterized from various bacterial strains and are commercially 
available. In bacteria, these enzymes protect against invasion by foreign 
DNA (host DNA is protected by bacteria-specific methylation systems). In 
DNA laboratories, restriction enzymes find wide application in assays of 
restriction fragment-length polymorphisms (RFLPs). 

RFLP analyses involve cutting (restricting) DNA with one or more 
endonucleases, separating the resulting fragments according to molecular 
weight by gel electrophoresis, and visualizing the size-sorted fragments. 
Differences among individuals in these "digestion profiles" may result from 
base substitutions within cleavage sites, additions or deletions of DNA, or 
sequence rearrangements, with each source of variation producing charac- 
teristic banding changes. Three important and partially interrelated vari- 
ables in these assays are the electrophoretic media employed, the means of 
fragment visualization, and the choice of DNA to be analyzed. These general 
considerations will be discussed first, and methodological details for particular 
applications will be added afterward. 
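
How a single base substitution within a cleavage site alters a digestion 
profile can be illustrated with a small in silico digest. The sketch below 
is a simplification: it treats the DNA as a linear single strand, uses the 
EcoRI recognition sequence given above (GAATTC), and assumes cleavage after 
the first G, which is the well-known EcoRI cut position but is not stated 
in this passage. The sequences and the function name are hypothetical. 

```python
def digest(sequence: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Fragment lengths (largest first) from cutting a linear sequence.

    Every occurrence of `site` is cut `cut_offset` bases after its start
    (offset 1 corresponds to G^AATTC for EcoRI on the strand shown).
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    edges = [0] + cuts + [len(sequence)]
    return sorted((b - a for a, b in zip(edges, edges[1:])), reverse=True)


if __name__ == "__main__":
    # Hypothetical 47-base sequence with two EcoRI sites
    wild_type = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
    # Point mutation (A -> G) inside the second site destroys it
    mutant = wild_type[:38] + "G" + wild_type[39:]

    print("wild type fragments:", digest(wild_type))
    print("mutant fragments:   ", digest(mutant))
```

Loss of a site merges two adjacent fragments into one larger fragment, 
which is the characteristic banding change expected for a substitution 
within a cleavage site. A circular molecule such as animal mtDNA would 
yield one fewer fragment than this linear model for the same number of 
sites. 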

The usual electrophoretic media are agarose or acrylamide gels. These 
gels form dense matrices through which, under the influence of an electric 
current, small DNA fragments migrate faster than large fragments. At neu- 
tral pH, DNA is negatively charged, and thus moves toward the anode at 
rates determined by molecular size. Agarose gels are used to separate DNA 
fragments in the size range of about 300-20,000 base pairs (bp), whereas 
acrylamide gels optimally separate restriction fragments about 10-1,000 bp 
long. To facilitate estimates of restriction fragment lengths, researchers nor- 
mally include molecular size standards (commercially available) in each gel. 

DNA fragments can be visualized in several ways. Some electrophoret- 
ic assays begin with highly purified DNA isolated from particular sources 
(such as mitochondria), in which case DNA fragments in the gel are 
revealed by chemical stains or radioactivity. When DNA amounts are high 
(>50 ng per gel band), ethidium bromide provides a convenient agent for 
fragment detection. This chemical binds to DNA in such a way that staining 
intensities are proportional to fragment sizes, so digestion profiles are stoi- 
chiometric. Silver staining is similar and provides greater sensitivity in 
detecting small DNA quantities (< 100 pg; Guillemette and Lewis 1983). In 
highly sensitive “end-labeling” procedures, DNA digestion fragments are 
labeled radioactively with 32P- or 35S-tagged nucleotides prior to electrophoretic 
separation. After a gel is run, it is vacuum-dried and overlaid by 
X-ray film, whose development as an autoradiograph reveals the positions 
to which the DNA fragments migrated. With end-labeling, band intensities 
are independent of fragment size (because all fragments have two labeled 
ends), and the method is therefore especially useful in revealing small fragments 
when DNA amounts are limited. 

Other RFLP assays begin with DNA of heterogeneous classes (e.g., total 
nuclear DNA), with elucidation of DNA fragments from particular genes 
accomplished after electrophoresis by the technique of “Southern hybridiza- 
tion” (Southern 1975). In this method, DNA fragments in a gel are denatured 
in a basic solution and then transferred as single strands (by capillary action 
or electrophoresis) to a nylon or nitrocellulose membrane (Figure 3.6). This 
membrane is incubated with a single-stranded "probe"—DNA previously 
isolated, purified, and radioactively labeled—under conditions in which 
strands that are complementary to those of the probe hybridize with the 
probe to form radioactive duplexes in the membrane. Under high-strin- 
gency conditions, hybridization with distantly related or non-homologous 
DNA is avoided. Thus, the probe in effect picks sequences that are comple- 
mentary and (ideally) homologous to itself from the multitude of undetect- 
ed fragments that also migrated through the gel. Those fragments with 
sequence similarity to the probe then are visualized by autoradiography of 
the "Southern blot." 

Figure 3.6 General protocol for Southern blotting. (After Burke 1989.) 

The probe in Southern hybridizations thus identifies particular pieces of 
DNA. This probe may constitute, for example, a single gene from a nuclear 
or cytoplasmic genome, a non-coding stretch of DNA sequence, or an entire 
animal mtDNA. If the probe contains DNA that is present in multiple copies 
in the genome, the Southern blot reveals fragments from all members of the 
family to which the probe hybridized. In some such cases, Southern blots 
reveal highly complex digestion profiles wherein nearly all individuals are 
distinguished by their “DNA fingerprints.” The probes used in Southern 
hybridizations may come from DNA highly purified by physical means 
(e.g., mtDNA isolated via CsCl gradient centrifugation) or from cloning of 
particular genes through biological vectors (Sambrook et al. 1989). For rap- 
idly evolving sequences, a probe's utility may be confined to assays of close- 
ly related species, whereas probes for slowly evolving sequences may cross- 
hybridize across broad taxonomic assemblages. 

Restriction analyses therefore encompass a wide diversity of technical 
approaches, details of which are described by Dowling et al. (1996c), Hames 
and Higgins (1985), Hoelzel (1992), Karl and Avise (1993), Lansman et al. 
(1981), Quinn and White (1987), Sambrook et al. (1989), and Watson et al. 
(1992). Furthermore, different classes of DNA differ dramatically with 
respect to the nature of genetic information provided by restriction analysis, 
so they will be discussed separately in the following sections. 


Animal mitochondrial DNA 


Restriction analyses of mtDNA dominated the field of phylogeography 
from the late 1970s until about 1990, when direct mtDNA sequencing made 
them somewhat passé. Nonetheless, most of the key evolutionary properties 
of mtDNA were discovered in this earlier era, so RFLP assays are of histor- 
ical interest for the lessons they provide as well as for their similarities and 
contrasts with modern direct mtDNA sequencing. 

In most RFLP assays, a crucial initial step in isolation of animal mtDNA 
is the efficient separation of cytoplasm (where mitochondria are housed) 
from cell nuclei (Figure 3.7). Soft tissue such as heart, liver, or ovary is 
minced, gently homogenized, and centrifuged at low speed (700 x g) to 
remove nuclei and cellular debris. Subsequent centrifugation at higher 
speed (20,000 x g) pellets mitochondria, which are then washed and lysed. 
The next step involves CsCl-EtBr gradient centrifugation at speeds in excess 
of 160,000 x g. Mitochondrial DNA, which appears as a discrete band in the 
resulting gradient, is removed by hypodermic needle and separated from 
remaining contaminants by dialysis. Purified mtDNA can then be used as a 
probe in Southern blots (to reveal mtDNA bands in samples containing het- 
erogeneous DNA), or the purification process described above can be 
repeated for each specimen and the RFLPs elucidated directly by chemical 
stains or radioactive labeling. A rate-limiting step in these mtDNA analyses 
is the lengthy centrifugation process, but various shortcuts have been developed 
for special circumstances, such as when mitochondria-rich cells are 
available or only small amounts of mtDNA are needed (Carr and Griffith 
1987; Chapman and Powers 1984; Jones et al. 1988; Palva and Palva 1985; 
Powell and Zuniga 1983). 

Figure 3.7 General protocol for mtDNA restriction site analysis by radioactive 
end-labeling. 

In the restriction enzyme era, most researchers working with animal 
mtDNA employed end-labeling procedures. For plant mitochondrial and 
chloroplast genomes, which are much larger, procedures usually involved 
Southern blotting using cloned genes or subsets of the genome as probes. 
More recently, PCR-mediated sequencing of cytoplasmic genomes has large- 
ly supplanted these earlier approaches. Interestingly, however, a serious 
complication has been discovered that can affect Southern blotting and 
PCR-based approaches, but not assays of physically isolated DNA. It turns 
out that both recent and ancient transfers of organelle DNA to the nucleus 
have been relatively common events in evolution, such that many sequences 
related to cytoplasmic genes now also exist as functionless derivatives in the 
nuclear genome (see Chapter 8). Unless these paralogous pseudogenes are 
properly recognized for what they are (sometimes a tricky proposition; e.g., 
Schneider-Broussard and Neigel 1997), they can potentially confound phy- 
logenetic analyses. It is rather fortuitous that in the RFLP era, mtDNA was 
typically physically purified before assay, such that the unwanted compli- 
cation of mtDNA pseudogenes in the nucleus simply was not encountered. 

Long before whole-genome sequencing of animal mtDNA became 
almost routine (e.g., Miya et al. 2003), the general structure and genetic basis 
of variation in this molecule already were well elucidated (see seminal 
reviews in Attardi 1985; Brown 1985; Cantatore and Saccone 1987; Gray 1989; 
Wallace 1982, 1986; Wolstenholme 1992). With few exceptions, animal 
mtDNA is a closed circular molecule, typically 15-20 kilobases (kb) in length, 
and composed of 37 genes (Figure 3.8A) coding for 22 tRNAs, 2 rRNAs, and 
13 mRNAs specifying proteins involved in electron transport and oxidative 
phosphorylation. Nearly the entire mtDNA genome is involved in coding 
function; introns, large families of repetitive DNA, pseudogenes, and even 
sizable spacer sequences between genes are rare or lacking in most cases. A 
"control region" of about 1 kb initiates replication and transcription. Gene 
arrangement in animal mtDNA is evolutionarily conserved, although differ- 
ences in gene order and content often do distinguish higher taxa, and these 
features have proved useful in macrophylogeny estimation (see Chapter 8). 
Similarly, the detailed structures of mitochondrial tRNA and rRNA genes 
and their products (Cantatore et al. 1987; Wolstenholme et al. 1987), as well 
as differences among taxa in the mtDNA genetic code, have proved to be 
phylogenetically informative (see Chapter 8). 

With regard also to the general mode of animal mtDNA evolution, much 
was learned from early population surveys (see seminal reviews by Avise 
and Lansman 1983; Birley and Croft 1986; Harrison 1989; Moritz et al. 1987; 
Wilson et al. 1985). For example, notwithstanding extensive intraspecific 
sequence variation, most individual animals proved to be homoplasmic, or 
nearly so, meaning that a single mtDNA sequence predominates in all cells 
and tissues of a given specimen (although many exceptions are known; e.g., 
Bermingham et al. 1986; Hale and Singh 1986; Moritz and Brown 1987, and 
references therein). Possible reasons for this characteristic sequence homo- 
geneity within individuals remain poorly understood, but bottlenecks in 
mtDNA numbers in germ cell lineages probably are involved (Birky et al. 
1989; Chapman et al. 1982; Clark 1988; Laipis et al. 1988; Rand and Harrison 
1986; Solignac et al. 1984, 1987; Takahata 1985). Regardless of how such 
sequence homogeneity mechanistically arises, this phenomenon is pragmat- 
ically crucial in virtually all genealogical applications for mtDNA. 

It was also discovered early on that animal mtDNA evolves rapidly at the 
sequence level, due in part to inefficient mutation repair mechanisms (Brown 
et al. 1979; Wilson et al. 1985). Some sequences within the control region 
evolve even faster and are therefore of special utility in high-resolution analy- 
ses of shallow (recent) population structure (e.g., Stoneking et al. 1991). 
Addition/deletion changes in mtDNA are not rare, but most differences 
between sequences reflect point mutations, with a strong initial bias for transitions 
over transversions (Aquadro and Greenberg 1983; Brown and Simpson 
1982; Brown et al. 1982; Greenberg et al. 1983). 

Figure 3.8 Major structural features of animal mitochondrial DNA and plant 
chloroplast DNA. (Molecules are not drawn to the same scale.) (A) Human 
mtDNA, composed of a control region (CR) and genes encoding 2 rRNAs (12S and 
16S), 22 tRNAs (open circles), and 13 functional polypeptides. Also shown are sites 
(OH and OL) at which replication is initiated along complementary DNA strands. 
(B) Tobacco (Nicotiana tabacum) cpDNA, composed of large and small single-copy 
regions (LSC, SSC) and a large inverted repeat (IR). Also shown is the position of 
the rbcL gene, DNA sequences of which have figured prominently in phylogenetic 
analyses of plants (see Chapter 8). 

Perhaps most important for genetic marker purposes was the discovery 
that mtDNA is transmitted predominantly through maternal lines in most 
species (Avise and Vrijenhoek 1987; Dawid and Blackler 1972; Giles et al. 
1980; Gyllensten et al. 1985a; Hutchison et al. 1974; see review in Birky 1995). 
Only a few departures from predominant maternal inheritance have been 
uncovered (Avise 1991b; Gyllensten et al. 1991; Kondo et al. 1990), the most 
notable being a "doubly uniparental" hereditary mode in Mytilus and relat- 
ed bivalves wherein females transmit their mtDNA to both sons and daugh- 
ters, whereas males transmit their mtDNA to sons only (Hoeh et al. 1991, 
1997, 2002; H.-P. Liu et al. 1996; Zouros et al. 1992, 1994). In this mollusk sys- 
tem, genetic recombination among mtDNA molecules has also been docu- 
mented (Ladoukakis and Zouros 2001). The Mytilus case has been of special 
interest precisely because it is so exceptional (Zouros 2000). 

In most other taxa, uniparental inheritance clearly limits the opportuni- 
ty for evolutionarily significant genetic recombination between mtDNA 
molecules. Nonetheless, reports of occasional "paternal leakage" of animal 
mtDNA into zygotes have raised the specter of possible recombination 
between distinctive mtDNA genotypes, as have some interpretations of dis- 
equilibrium patterns in population genetic data for this molecule (Awadalla 
et al. 1999; Eyre-Walker et al. 1999; Lunt and Hyman 1997). However, this 
latter class of evidence has been challenged (Arctander 1999; Elson et al. 
2001; Kivisild et al. 2000; Merriweather and Kaestle 1999), and the recombi- 
nation issue has not been definitively resolved. What remains uncontested 
is that animal mtDNA genotypes are mostly maternally transmitted, and 
that if physical recombination does occur, it is unusual or rare in most taxa. 
Thus, mtDNA molecules provide matrilineal markers that are transmitted 
asexually through the pedigrees of what may otherwise be sexually repro- 
ducing species. For simplicity, mtDNA genotypes are therefore often 
referred to as molecular clones or haplotypes, and their inferred evolution- 
ary interrelationships are interpreted as estimates of "matriarchal phyloge- 
ny" (Avise et al. 1979b). From a functional perspective, mtDNA consists of 
about 37 genes, but from a phylogenetic perspective, the entire molecule is 
one linked genealogical unit (i.e., supergene) with numerous alleles. 

The raw data in most mtDNA restriction surveys consisted of fragment- 
length profiles produced by each of a dozen or more restriction enzymes 
(Figure 3.9). Because mtDNA is a closed circle, the number of linear fragments 
equals the number of restriction sites recognized by each endonuclease. A use- 
ful check on gel scoring is provided by mtDNA genome size within a given 
species, to which observed fragment lengths should sum. This feature also 
facilitates direct comparisons of RFLP profiles across studies or among differ- 
ent laboratories (a useful characteristic not fully shared by allozyme methods, 
in which meaningful cross-study comparisons of allelic products require that 
known electromorph standards be run in all gels). A typical mtDNA population
survey revealed about 50-100 restriction fragments per individual.
Because the restriction enzymes employed were commonly five- and six-base
cutters, these results were equivalent in information content to assaying
250-600 bp of recognition sequence per specimen. The larger mtDNA population
surveys sometimes included many hundreds of individuals.

Figure 3.9 Interpretation of mtDNA digestion profiles. Shown is an
autoradiograph of EcoRI digests of mtDNA from 18 eels (genus Anguilla). The seventh lane
from the right is a molecular size standard in which the darkest band is 1.6 kb in
size; successive bands above it are approximately 2, 3, 4, 5, 6, 7, 8, ... kb, and the
band below it is 1.0 kb. Five EcoRI patterns (A-E) are evident, their
interrelationships summarized in a parsimony network shown under the radiograph (arrows
indicate direction of restriction site loss, not necessarily the direction of evolution).
For example, pattern A differs from B by loss of an EcoRI restriction site, which
converts the 4.6-kb and 8.0-kb fragments in the B profile to the 12.6-kb fragment
in A. In turn, C differs from B by gain of an EcoRI site, which converts B's 8.0-kb
fragment to C's fragments of sizes 5.1 kb and 2.9 kb. Pattern E apparently has a
"doublet" (two fragments of indistinguishable molecular weight) at 3.1 kb. (From
Avise 1987.)
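
The bookkeeping behind these restriction surveys can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 16.5-kb genome size and the site positions are hypothetical (chosen so that loss of one site merges a 4.6-kb and an 8.0-kb fragment into a 12.6-kb fragment, as in Figure 3.9), and the function names are invented for this example rather than taken from any published software.

```python
def circular_digest(genome_size, sites):
    """Fragment lengths (bp) after cutting a circular molecule at each site."""
    cuts = sorted({pos % genome_size for pos in sites})
    if not cuts:
        return []  # an uncut circle yields no linear fragments
    fragments = []
    for i, cut in enumerate(cuts):
        next_cut = cuts[(i + 1) % len(cuts)]
        # Distance to the next site, wrapping around the circle; a single
        # cut simply linearizes the molecule into one full-length fragment.
        fragments.append((next_cut - cut) % genome_size or genome_size)
    return fragments


def assayed_bp(n_sites, recognition_length):
    """Base pairs of recognition sequence surveyed by n_sites sites."""
    return n_sites * recognition_length


GENOME_SIZE = 16_500  # hypothetical mtDNA genome size (bp)
profile_b = circular_digest(GENOME_SIZE, [0, 4_600, 12_600, 14_500])
profile_a = circular_digest(GENOME_SIZE, [0, 12_600, 14_500])  # one site lost

for name, profile in (("B", profile_b), ("A", profile_a)):
    # Gel-scoring check: fragment lengths should sum to the genome size.
    assert sum(profile) == GENOME_SIZE
    print(name, sorted(profile, reverse=True))

# Survey extremes: 50 five-base sites up to 100 six-base sites.
print(assayed_bp(50, 5), assayed_bp(100, 6))
```

Because the molecule is circular, each profile contains exactly as many fragments as sites, and the two extremes of the survey arithmetic reproduce the 250-600 bp range quoted in the text.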

Most differences among mtDNA digestion profiles proved to arise from 
point mutations that had created or destroyed enzyme recognition 
sequences. Often, the number of mutations distinguishing particular diges- 
tion profiles could be deduced directly from single-enzyme gel patterns, 
using information about mtDNA fragment sizes (see Figure 3.9). Such data
could be accumulated across restriction enzymes and used to generate composite
mtDNA haplotype descriptions. Data were also recorded as binary
characters and summarized in presence-absence matrices of restriction sites 
across individuals or mtDNA clones (Box 3.1). Some studies went further by 
mapping the positions of sites relative to one another or to landmarks on the 
mtDNA genome, using double-digestion or partial-digestion procedures 
(Figure 3.10). Normally, however, sites could be mapped only to within a
few tens or hundreds of base pairs of their true locations. 
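
As a rough illustration of how such data might be stored and compared, the sketch below builds a small presence-absence matrix and a composite haplotype label. All site scores, pattern letters, and function names are hypothetical; the enzyme names serve only as familiar examples.

```python
from itertools import combinations

# Hypothetical scores for six restriction sites (1 = present, 0 = absent).
SITE_MATRIX = {
    "a": [1, 1, 0, 1, 0, 1],
    "b": [1, 1, 0, 1, 1, 1],
    "c": [1, 0, 0, 1, 1, 1],
}


def composite_haplotype(patterns, enzyme_order):
    """Join single-enzyme pattern letters into one composite label."""
    return "".join(patterns[enzyme] for enzyme in enzyme_order)


def site_differences(x, y):
    """Number of sites present in one haplotype but absent in the other."""
    return sum(1 for i, j in zip(x, y) if i != j)


def pairwise_differences(matrix):
    """Site differences for every pair of haplotypes in the matrix."""
    return {
        (h1, h2): site_differences(matrix[h1], matrix[h2])
        for h1, h2 in combinations(sorted(matrix), 2)
    }


# Hypothetical single-enzyme patterns scored for one specimen.
label = composite_haplotype(
    {"EcoRI": "A", "HindIII": "C", "BamHI": "B"},
    enzyme_order=["EcoRI", "HindIII", "BamHI"],
)
print(label)
print(pairwise_differences(SITE_MATRIX))
```

Counting site differences in this way treats each restriction site as an independent binary character, which is the property that makes site data easier to interpret than raw fragment data.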

Some mtDNA RFLPs were sufficiently complex that mutational path- 
ways distinguishing haplotypes were difficult or impossible to deduce from 
the gel patterns alone. Researchers could nonetheless count percentages of 
shared (and presumably homologous) fragments, but caution was indicated 
because, unlike site changes, not all fragment changes are independent. For 
example, a point mutation that creates a restriction site results in the
correlated appearance of two smaller fragments with the same total molecular weight
as the one larger fragment lost (see Figure 3.9). Statistical methods were
developed to take into account such correlated fragment changes in converting
percentages of shared fragments to estimates of sequence divergence
(e.g., Nei and Li 1979; Upholt 1977). Additional sources of RFLP variation
stemmed from occasional differences in mtDNA size, most often due to variation
in copy number of localized tandem repeats in or near the control
region of the molecule (Bermingham et al. 1986; Harrison et al. 1985; Moritz
and Brown 1986, 1987). These localized repeat regions ranged in size from a
few base pairs to more than 1 kb, and the larger ones were readily distinguished
from restriction site changes because they concordantly altered the
total lengths of restriction fragments in all digestion profiles (smaller fragment
size differences could be overlooked, however, particularly when they
resided in high-molecular-weight gel bands).

Figure 3.10 Scoring and mapping of mtDNA restriction sites. Shown are
mtDNA haplotypes A, B, and C as revealed by digestion by restriction
endonucleases a, b, and c. The restriction maps at the top were unknown at the outset, but
were deduced from observed gel profiles produced in single- and double-enzyme
digests. Fragment sizes (in kb) are indicated. A parsimony network at the bottom
summarizes the likely pathway of evolutionary interconversion between
haplotypes with respect to these assayed sites. Note in this case that restriction site
changes and the parsimony network (but not the full restriction site map) could
also have been deduced directly from single-enzyme digestion profiles.
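
The conversion of shared-fragment percentages into sequence divergence can be sketched as follows. The formulas are those usually attributed to Nei and Li (1979) as they are commonly presented in textbooks: with F the proportion of shared fragments and r the recognition-sequence length, F is approximately G^4 / (3 - 2G) and d = -(2/r) ln G, with G found by iteration. This is an assumption of the sketch and should be checked against the original paper before use. The function names and example counts are hypothetical.

```python
import math


def shared_fraction(n_x, n_y, n_xy):
    """Proportion of shared fragments (or sites): 2 * n_xy / (n_x + n_y)."""
    return 2.0 * n_xy / (n_x + n_y)


def divergence_from_sites(s, r):
    """Divergence from the proportion s of shared restriction sites."""
    return -math.log(s) / r


def divergence_from_fragments(f, r, tol=1e-12, max_iter=1000):
    """Divergence from the proportion f of shared fragments.

    Solves f = G**4 / (3 - 2*G) for G by fixed-point iteration, starting
    from G = f**0.25, then returns d = -(2 / r) * ln(G).
    """
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in the interval (0, 1]")
    g = f ** 0.25
    for _ in range(max_iter):
        g_next = (f * (3.0 - 2.0 * g)) ** 0.25
        if abs(g_next - g) < tol:
            g = g_next
            break
        g = g_next
    return -(2.0 / r) * math.log(g)


# Hypothetical comparison: 20 and 22 fragments scored, 15 shared,
# all produced by six-base cutters.
f = shared_fraction(20, 22, 15)
print(round(f, 3), round(divergence_from_fragments(f, r=6), 4))
```

The iteration is needed because gain or loss of a single site changes several fragments at once, so the shared-fragment proportion does not translate linearly into site changes.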

With the advent of PCR-mediated DNA sequencing (described below), 
the laboratory methods for analyzing mtDNA have changed, but the gener- 
al nature of the information provided (on matrilines) and the classes of bio- 
logical problems that can be addressed (especially at the intraspecific level) 
remain much the same. Direct sequencing has also expanded opportunities 
for delivering mtDNA data in a form suitable for phylogenetic reconstruction
at supraspecific levels. Assays of animal mtDNA remain among the
most powerful and popular of all molecular approaches in ecology and
evolution (Randi 2000).

BOX 3.1 Restriction Site Matrix

The example shown here involves 96 restriction sites (0, absent; +, present) in 10
different mtDNA haplotypes (a-e and p-t) in sharp-tailed sparrows, Ammodramus caudacutus.
Source: From a larger data set in Rising and Avise 1993.


Plant organelle DNA 


Several evolutionary features of animal mtDNA were completely unantici- 
pated, among them the predominance of a single mtDNA sequence (homo- 
plasmy) within single specimens despite extensive between-individual 
sequence differences, and the rapid pace of nucleotide substitution despite 
what would seem to be severe functional constraints on mtDNA, as judged 
by the molecule's "genetic economy" (Attardi 1985). However, after the 
major evolutionary features of animal mtDNA were revealed, it might have 
been supposed that these attributes would apply to other cytoplasmic 
genomes as well. Surprisingly, this did not prove to be the case. 


PLANT MITOCHONDRIAL DNA. Plant mtDNA is highly variable in size, 
ranging from about 200 kb to 2,500 kb across species (Palmer 1985; Pring 
and Lonsdale 1985; Ward et al. 1981). Within an individual, mtDNA 
sequences typically exist as a heterogeneous collection of circles arising 
from extensive recombination that interconverts between sub-genomes and 
higher-order multimers (Backert et al. 1996, 1997; Hanson and Folkerts 1992; 
Palmer and Herbon 1986; Palmer and Shields 1984). Inheritance is often, but 
not invariably, maternal (Birky 1978; Forsthoefel et al. 1992; Havey et al. 
1998; Kondo et al. 1998). Although plant mtDNA is generally similar to animal
mtDNA with regard to gene content and general function, its evolutionary
pattern differs diametrically (Birky 1988; Palmer 1992). Plant
mtDNA evolves rapidly with respect to gene order, but about a hundredfold 
more slowly than animal mtDNA with respect to nucleotide sequence 
(Palmer and Herbon 1988). These properties, as well as the technical diffi- 
culties of laboratory assay, have conspired to limit the utility of plant 
mtDNA in molecular systematics and population biology (but see 
Desplanque et al. 2000; Huang et al. 2001; Olson and McCauley 2002). 


CHLOROPLAST DNA. Plant chloroplast DNA offers yet another story, as
emphasized in an influential early review by Palmer (1985). It is transmitted 
maternally in most species (Birky 1978; Gillham 1978; Hachtel 1980; Havey 
et al. 1998), biparentally in some (e.g., Harris and Ingram 1991; Metzlaff et 
al. 1981; Shore and Triassi 1998), and paternally in various others (Chat et al. 
1999; Yang et al. 2000), including most gymnosperms (e.g., Dong et al. 1992; 
Kondo et al. 1998; Sperisen et al. 2001; Szmidt et al. 1987; Wagner et al. 1987). 
This circular molecule (Figure 3.8B) varies greatly in size among species 
(from approximately 120 to 217 kb in photosynthetic land plants, for exam- 
ple), due largely to the extent of reiteration of a large inverted repeat that 
includes genes for rRNA subunits (Zurawski and Clegg 1987). With some 
possible exceptions (Milligan et al. 1989; Wagner et al. 1987), the rate of
cpDNA evolution is slow, in terms of both primary nucleotide sequence
(mean silent substitution rates have been estimated at only three to four
times greater than those of plant mtDNA; Wolfe et al. 1987) and gene
rearrangement (Curtis and Clegg 1984; dePamphilis and Palmer 1989; 
Palmer 1990; Palmer and Thompson 1981; Ritland and Clegg 1987). 
Seminal studies on cpDNA involved RFLP analyses (Palmer and Zamir 
1982), but in later years most molecular analyses entailed direct sequencing 
of particular cpDNA genes. Because of cpDNA's leisurely pace of evolution, 
such data have proved especially valuable for estimating plant phylogeny 
at higher taxonomic levels (e.g., Clegg et al. 1986; Palmer 1987; Palmer et al. 
1988a; Zurawski and Clegg 1987). Several unique structural features of 
cpDNA (see Chapter 8) have further contributed to the identification of various
plant clades (e.g., Downie and Palmer 1992; Jansen and Palmer 1987).
Particular cpDNA sequences have also been tapped as a valuable source of 
information on intraspecific phylogeography (see Chapter 6). Interestingly, 
the first restriction site appraisals of cpDNA (Atchison et al. 1976; Vedel et
al. 1976) were contemporaneous with early studies on animal mtDNA
RFLPs, and they sometimes uncovered at least modest variation within as 
well as among closely related plant species (Banks and Birky 1985). None- 
theless, phylogeographic studies of plants generally lagged far behind those 
of animals, a situation that only recently has become partially rectified (Petit 
and Vendramin 2003; Schaal et al. 2003). 


Single-copy nuclear DNA 


As applied to single-copy loci (scnDNA) or low-copy genes in the nucleus, 
RFLP analyses traditionally relied on Southern blotting (see Figure 3.6), with 
the probes representing DNA sequences cloned into a biological vector such 
as lambda phage or a bacterial plasmid (Kochert 1989; Figure 3.11). The 
probes were particular genes of known function or anonymous sequences 
drawn at random from a genomic library (i.e., from a collection of cloned
DNA fragments). 

Probes for known-function genes are often derived from "complementary"
DNA (cDNA); that is, sequences produced by reverse transcription of
a particular messenger RNA. Anonymous single-copy probes can be gener- 
ated as follows: Total cell DNA is extracted and digested with a restriction 
enzyme. Fragments of size 500-5,000 bp are isolated by electrophoresis and
cloned into a vector, thereby generating a DNA library. This library is then
screened for single-copy sequences by dot-blot hybridization (Figure 3.12),
whereby each clone's DNA is hybridized under controlled conditions with
radioactively labeled total cell DNA. The radioactive signal intensity of each
dot in the blot is then assessed: A strong signal indicates clones carrying
repetitive DNA, and a weak signal may identify clones with low-copy
sequence. One important distinction between cDNA and genomic clones as
probes is that the former represent processed coding sequences for a transcribed
gene product, whereas the latter can also include gene-flanking
regions as well as introns (non-coding sequences interspersed with the
exons that specify a gene's amino acid sequence). Some of these non-coding
sequences evolve rapidly and provide additional classes of genetic markers
(Friesen 2000).

Figure 3.11 General protocol for DNA cloning and genomic library construction.

Later-developed methods for generating scnRFLPs took advantage of
the polymerase chain reaction (PCR). For scnDNA identified in a nuclear
genomic library, PCR primers were generated and used to amplify homologous
DNA from each individual. These amplified DNAs were then
digested by restriction enzymes, electrophoresed, and chemically stained
(Saperstein and Nickerson 1991). The process is summarized in Figure
3.12. This PCR-based RFLP method offered some advantages over
Southern blotting (Karl and Avise 1993): it requires only a small amount of
template DNA; that DNA is of defined length (bounded by primers), so
size differences underlying RFLPs can readily be distinguished from
restriction site differences; PCR amplifies DNA in unmethylated condition,
so natural DNA methylation is not a potential confounding source of variation
in restriction digests; and the method bypasses any need for radioactive
isotopes and autoradiography. On the negative side, great effort is
entailed in constructing genomic libraries and screening them for low-copy
sequences.

Figure 3.12 General protocol for surveys of scnDNA RFLPs using the polymerase
chain reaction.

Restriction analyses of scnDNA are intended to reveal scorable RFLPs 
at individual loci. Raw data are in many respects analogous to those pro- 
vided by allozymes: diploid specimens can be described as homozygous or 
heterozygous (Figure 3.13); the Mendelian nature of polymorphisms can 
be verified by pedigree studies or by agreement of genotypic frequencies 
with Hardy-Weinberg expectations; a population may exhibit multiple 
alleles at a locus; and genotypic descriptions can be accumulated across 
loci. In principle, two major advantages over protein electrophoresis are 
the nearly unlimited pool of genetic loci that might be tapped (thousands 
of low-copy regions exist in most genomes); and the fact that polymor- 
phisms include silent as well as replacement substitutions. In practice, 
however, these methods proved much more demanding than protein elec- 
trophoresis and were used infrequently in molecular ecology and evolu- 
tion, although they had considerable impact in related research areas such 
as mapping of disease genes and quantitative trait loci (Botstein et al. 1980; 
B. Martin et al. 1989; Paterson et al. 1988; Weller et al. 1988), and in breed- 
ing studies and strain verification in domestic species (Apuya et al. 1988; 
Beckmann et al. 1986).

Figure 3.13 Interpretation of gel profiles for nuclear RFLP polymorphisms.
Illustrated is the method for diploid organisms, as assayed by Southern blotting
using a scnDNA probe. Restriction site positions along a stretch of DNA are shown
for each of two enzymes, with solid symbols indicating invariant restriction sites
and open symbols indicating variable sites either present (1) or absent (0) in
various chromosomes in the population. Heterozygous individuals in lanes of the gels
are "0/1"; homozygotes are "0/0" and "1/1." Other numbers indicate sizes (in kb)
of various restriction fragments.
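
The agreement of genotypic frequencies with Hardy-Weinberg expectations, mentioned above as one check on the Mendelian nature of a scnDNA polymorphism, can be sketched for a single two-allele restriction site. The genotype counts are hypothetical, the genotype labels follow the "1/1", "0/1", "0/0" notation of Figure 3.13, and the function name is invented for this example. The statistic is an ordinary chi-square with one degree of freedom, for which the 5% critical value is about 3.84.

```python
def hardy_weinberg_chi_square(n_11, n_01, n_00):
    """Allele frequency, expected genotype counts, and chi-square statistic
    for a two-allele locus scored as 1/1, 0/1, and 0/0 genotypes."""
    n = n_11 + n_01 + n_00
    p = (2 * n_11 + n_01) / (2 * n)  # frequency of the "1" (site present) allele
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_11, n_01, n_00)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


# Hypothetical samples of 100 diploid specimens each.
for counts in ((36, 48, 16), (50, 20, 30)):
    p, expected, chi2 = hardy_weinberg_chi_square(*counts)
    verdict = "consistent with" if chi2 < 3.84 else "departs from"
    print(counts, f"p={p:.2f}", f"chi2={chi2:.2f}", verdict, "Hardy-Weinberg")
```

A significant heterozygote deficit, as in the second hypothetical sample, may reflect population subdivision or inbreeding, but it can also signal a scoring problem, which is why pedigree confirmation remains the more direct test.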









Moderately repetitive gene families 


When a DNA probe with homology to repetitive genomic sequences is used 
in Southern blotting, the probe hybridizes to all such sequences and simulta- 
neously reveals restriction profiles at multiple members of the gene family. 
For example, ribosomal RNA genes in the nuclei of eukaryotic cells usually 
exist as tandemly repeated elements, with each repeat unit composed of a 
highly conserved coding sequence with a total length of about 6 kb, plus 
shorter and more variable non-coding spacer regions (Figure 3.14). These 
rDNA modules may occur at one or several chromosomal sites, with the total 
number of rDNA copies per genome varying from several hundred in some 
mammals and insects to many thousands in plants (Long and Dawid 1980). 

The ready availability of probes for rRNA genes prompted many 
Southern blotting studies of population variation and differentiation in 
these genetic regions (Appels and Dvorak 1982; Arnold et al. 1991; Davis et 
al. 1990; Rieseberg et al. 1990a,b; Rogers et al. 1986; Saghai-Maroof et al. 
1984; Schaal et al. 1987; Williams et al. 1985). These studies revealed RFLP 
markers that often distinguished related species and sometimes conspecific 
populations. Most of the genetic differences involved varying lengths of the 
repeat unit due to heterogeneity in the size of the spacer regions (see Figure 
3.14), with additional variation occasionally reflecting restriction site 
changes in both the coding and spacer regions (Schaal 1985). 

A central difficulty in interpreting genetic markers provided by any 
multi-gene family lies in understanding the degree to which concerted evo- 
lution may have homogenized the repeated DNA sequences (Arnheim 1983; 
Ohta 1980; Ohta and Dover 1983). From one perspective, an ideal situation 
would be concerted evolution that was so pronounced that all copies of a
repeat within each individual or local population were quickly homogenized,
such that each specimen carried a single, unambiguous genotype. On
the other hand, extreme concerted evolution of this sort would also, in effect,
confine the information content in a family of sequences to that of merely one
gene. Conversely, a paucity or absence of concerted evolution would mean
that a gene family carries multi-locus phylogenetic information, but it would
also make that information more difficult to retrieve and interpret, due ultimately
to the complications of distinguishing orthology from paralogy (see
Chapter 1). Empirically, Williams et al. (1987) showed that rDNA variants in
Drosophila have a nonrandom distribution between X and Y chromosomes,
suggesting in that case that concerted evolution within chromosomes is more
pronounced than concerted evolution between them. In another such analysis,
Arnold et al. (1988) showed that biased gene conversion had influenced
distributions of rDNA sequences in a grasshopper hybrid zone. Thus, the use
of nuclear rRNA genes (and other repetitive DNA families) as genetic markers
in microevolutionary studies can be problematic (Schaal et al. 1991). On
the other hand, Hamby and Zimmer (1992) concluded that "the most
remarkable feature of rDNA is the overall sequence homogeneity among
members of the gene family." Concerted evolution, plus a slow pace of
sequence divergence, have in fact been crucial to the widespread use of various
rDNA sequences in higher-level systematics (see Chapter 8).

Figure 3.14 Structural features of rDNA repeat modules. Drawn to approximate
scale are representative structures in a bacterium, plant, and animal. Shaded
regions indicate loci encoding small (16S and 18S) and large (23S, 26S, 28S)
subunits of ribosomal RNA, as well as 5S rRNA elements. Black regions indicate
internal transcribed spacers, which often differ in length. (After Appels and
Honeycutt 1986.)

Despite these potential complications, RFLP markers from rDNAs have 
contributed to studies of geographic population structure and patterns of 
introgression in hybrid zones (Arnold et al. 1987; Baker et al. 1989; Cutler et 
al. 1991; Learn and Schaal 1987). Southern blotting procedures have also 
been employed to assess levels of genetic variability in other multi-gene 
families (e.g., Gibbs et al. 1991). For example, the major histocompatibility 
complex (MHC) is a family of tightly linked loci that encodes cell surface 
antigens involved in immunological responses (Edwards et al. 2000). In 
many mammals, particular MHC genes are known to be highly polymor- 
phic, some with scores of alleles (Hedrick et al. 1991; Hughes and Nei 1988, 
1989; Klein 1986). In one early example of MHC's utility in microevolution- 
ary studies, feline probes homologous to one class of MHC loci were 
employed to assess molecular variation in two cat populations (African 
cheetahs and Asiatic lions; Winkler et al. 1989; Yuhki and O'Brien 1990) sus- 
pected by other criteria to possess low genome-wide variability due to his- 
torical bottlenecks in population size (see Chapter 9). 

In recent years, RFLP approaches applied to moderately repetitive 
nuclear gene families (like those applied to mtDNA, cpDNA, and scnDNA) 
have mostly been supplanted by more efficient methods of direct DNA 
sequencing. 


Minisatellites and DNA fingerprinting 


The genetic complexity inherent in repetitive DNA families sometimes can 
be turned to advantage in terms of providing individual-specific genetic
markers. The term "DNA fingerprinting" usually is associated with a molecular
approach introduced by Jeffreys et al. (1985a), in which Southern blot
assays of hypervariable DNA regions reveal complex gel-band profiles that 
distinguish most or all individuals (barring monozygotic twins) within a 
sexually reproducing species (Bruford et al. 1992). The DNA probes origi- 
nally employed by Jeffreys (1987) hybridize to conserved core sequences 
(10-15 bp long) scattered in numerous arrays about the human genome as 
part of a system of “dispersed tandem repeats,” also referred to as min- 
isatellite loci or VNTRs (variable number of tandem repeats; Figure 3.15). 
Each repetitive unit within an array is about 10-70 bp long. Increases and 
decreases in the lengths of particular arrays often result from changes in tan- 
dem repeat copy number arising from high rates of unequal crossing over 
during meiosis (indeed, minisatellite sequences may be genomic hotspots 
for recombination; Jarman and Wells 1989). 

Jeffreys's original probes were isolated from a human myoglobin intron 
and were applied to problems in human forensics (Dodd 1985; Gill et al. 1985; 
Jeffreys et al. 1985b,c), but these probes soon were shown to cross-hybridize 
to reveal DNA profiles in other mammals (Hill 1987; Jeffreys and Morton 
1987; Jeffreys et al. 1987), as well as birds (Brock and White 1991; Burke and 
Bruford 1987; Hanotte et al. 1992a; Meng et al. 1990), fishes (Baker et al. 1992),
and even some invertebrates, such as corals and snails (Coffroth et al. 1992; 
Jarne et al. 1990, 1992). Probes for additional hypervariable minisatellites were 
then identified (such as one from M13 phage) that behaved similarly in pro- 
viding complex DNA fingerprints in various vertebrate taxa (Georges et al. 
1987; Longmire et al. 1990, 1992; Vassart et al. 1987), invertebrate animals (Zeh 
et al. 1992), plants (Rogstad et al. 1988), and microbes (Ryskov et al. 1988). 

The complex gel profiles characteristic of multi-locus DNA fingerprints 
appear when a restriction enzyme is employed that cleaves DNA outside 
the tandem repeat arrays (see Figure 3.15). From each homologous chromo- 
some position in an individual, either one or two DNA gel bands is 
revealed, depending on whether the specimen is homozygous or heterozy- 
gous with respect to the number of tandem repeats in that array. The pres- 
ence of several such arrays scattered about the genome results in composite 
digestion profiles typically consisting of 20 or more scorable bands per individual
in the 4-25-kb size range. In many animal populations, including
humans, dozens of alleles of different lengths may segregate at each chromosomal
position. The multi-allelic plus multi-locus nature of the data
results in elaborate, individual-specific gel profiles that are powerful in
revealing genetic identity versus non-identity. They are also useful in assess- 
ing genetic parentage because, barring spontaneous de novo mutation 
(Jeffreys et al. 1988b), each band in an individual's DNA fingerprint derives 
from either its biological mother or father (see Chapter 5). 
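
The logic of identity and parentage assessment from multi-locus fingerprints can be sketched with simple set operations. Bands are represented here by hypothetical fragment sizes in kb, and two bands are treated as the same only if their labels match exactly; real gel scoring must allow for measurement error, and a rare unattributable band may reflect de novo mutation rather than misassigned parentage. The function names are invented for this example.

```python
def band_sharing(x, y):
    """Proportion of shared bands between two fingerprints: 2 * n_xy / (n_x + n_y)."""
    x, y = set(x), set(y)
    return 2.0 * len(x & y) / (len(x) + len(y))


def unattributable_bands(offspring, mother, candidate_father):
    """Offspring bands found in neither the mother nor the candidate father."""
    return set(offspring) - set(mother) - set(candidate_father)


# Hypothetical fingerprints (band sizes in kb).
mother = {21.0, 15.5, 12.2, 9.8, 6.4}
true_father = {19.3, 15.5, 11.0, 7.7, 5.1}
other_male = {20.2, 14.1, 11.0, 8.3, 4.4}
offspring = {21.0, 15.5, 11.0, 9.8, 5.1}

print(band_sharing(mother, offspring))
print(unattributable_bands(offspring, mother, true_father))
print(unattributable_bands(offspring, mother, other_male))
```

An empty set of unattributable bands is compatible with the proposed parents, whereas one or more leftover bands argue against the candidate.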

The genetic complexity that makes multi-locus DNA fingerprints advan- 
tageous for distinguishing individuals becomes a liability in other contexts. 
Generally, it remains unknown which bands in a fingerprint belong to which 
locus (array), so whether individuals are homozygous or heterozygous at
particular loci seldom is ascertained, nor can allelic or genotypic frequencies
in a population be specified. These problems seriously compromise attempts
to estimate genetic relatedness or other population-level parameters (such as
gene flow) from DNA fingerprints' measurable attribute: the proportion of
shared marker bands (Kuhnlein et al. 1990; Lynch 1988; Packer et al. 1991).
These shortcomings of multi-locus DNA fingerprinting for applications
other than genetic identity and parentage assessment thus prompted development
of methods to analyze minisatellite loci one at a time (Carter 2000).
These methods included use of refined DNA probes and more stringent
hybridization conditions in Southern blotting, as well as PCR-based methods
(Horn et al. 1989), to reveal genetic variation tied to specific minisatellite
arrays (Higgs et al. 1986; Jarman et al. 1986; Jeffreys et al. 1988a;
Nakamura et al. 1987; Wong et al. 1986, 1987). Each VNTR locus displays
just one or two bands in a DNA digestion profile, depending on whether the
specimen assayed is homozygous or heterozygous for number of repeats (or
other features) situated between the restriction sites. Another variant of this
approach was to characterize patterns of base substitution (in addition to
repeat copy number) within particular minisatellite loci (Jeffreys et al. 1990,
1991). For a brief time, minisatellite assays were the method of choice for
forensic practices in humans (Balazs et al. 1989; Budowle et al. 1991; see
Chapter 5) and wildlife (Burke et al. 1991; Hanotte et al. 1991, 1992b), but
they too soon were largely supplanted, in this case by microsatellite assays
(described below).

Figure 3.15 DNA fingerprinting using VNTR loci. Shown at the top of the figure
are six genomically dispersed DNA regions (on the same or different
chromosomes), each of which in a population harbors variable numbers of tandem repeat
elements. Solid circles within repeat units indicate a conserved core sequence to
which a probe hybridizes in a Southern blot. A restriction enzyme that cuts DNA
(at positions shown by arrows, outside each repeat array) reveals a complex
digestion profile for individual A on a gel autoradiograph (bottom part of the figure).
Other individuals (e.g., B and C) are likely to have different DNA fingerprints
when similarly assayed.


Polymerase Chain Reaction 


Invention of the polymerase chain reaction (PCR; Mullis and Faloona 1987;
Saiki et al. 1985, 1988) revolutionized not only molecular biology, but also
the fields of organismal and population biology (Arnheim et al. 1990), by 
stimulating many powerful new approaches to genetic marker acquisition. 
Basically, PCR enables researchers to quickly amplify or clone, in a test tube, 
assayable quantities of almost any desired piece of DNA from almost any 
biological source. Technical descriptions of PCR are given by Birt and Baker 
(2000), Innis et al. (1990), Palumbi (1996a), and Palumbi et al. (1991). 

The PCR technique involves three main steps (Figure 3.16): denatura- 
tion of double-stranded DNA by heating; annealing of primers to sites 
flanking the region to be amplified; and primer extension, in which strands 
complementary to the region between flanking primers are synthesized 
under the influence of a thermostable DNA polymerase (Taq). These three
steps are repeated 20 or more consecutive times, all in automated thermo- 
cycler devices that are now standard apparatus in genetics laboratories. 
During each cycle of denature-anneal-extend, the target sequence roughly 
doubles in quantity, so it soon assumes overwhelming preponderance. This 
purified product can then be assayed by any of several molecular proce- 
dures to be described shortly (as well as by the various kinds of RFLP analy- 
ses already mentioned; Morales et al. 1993). 

Figure 3.16 General protocol of the polymerase chain reaction for amplifying 
DNA. Steps shown: (1) isolate DNA; (2) denature DNA, anneal primers; (3) primer 
extension; (4) denature DNA, anneal primers; (5) primer extension; (6) denature 
DNA, anneal primers; (7) primer extension; (8) repeat cycles. (After Oste 1988.) 

The primers employed to initiate any PCR reaction typically are short 
sequences (ca. 20-30 nucleotides long) that exhibit high sequence similarity 
(especially at the 3’ end) to regions flanking the target sequence. They nor- 
mally are generated in special machines that are programmed (based on a 
specific known DNA sequence) to synthesize particular oligonucleotides. 
Depending upon the primers employed, sequences amplified by PCR can be 
coding or non-coding DNA, and they can come from any location in nuclear 
or organelle genomes. Amplified sequences are normally a few hundred bp 
to about 1 kb in length, but “long-PCR” procedures have also been devel- 
oped (Cheng et al. 1994; Cohen 1994; Li et al. 1995) to amplify much larger 
fragments, such as the whole 16-kb animal mtDNA genome (Nelson et al. 
1996). 
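
The way a primer pair delimits the amplified region can be sketched as an "in silico PCR". This is a toy illustration under stated assumptions: the function names are hypothetical, primers must match the template exactly (real primers tolerate some mismatch, especially away from the 3' end), and the example primers are only 10 nucleotides long for readability.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, fwd_primer: str, rev_primer: str):
    """Return the amplicon (top strand) bounded by the two primers, or None.

    The forward primer is assumed to match the top strand exactly; the
    reverse primer is assumed to match the bottom strand exactly, so its
    reverse complement is searched for on the top strand.
    """
    template = template.upper()
    start = template.find(fwd_primer.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


if __name__ == "__main__":
    fwd = "GACCTGAAGT"                        # hypothetical forward primer
    rev = reverse_complement("GGTTCAGTCC")    # hypothetical reverse primer
    # Hypothetical template: flank, forward site, target, reverse site, flank.
    template = "TT" + fwd + "CACACACACACA" + "GGTTCAGTCC" + "AA"
    print(in_silico_pcr(template, fwd, rev))
```

The amplicon includes both primer sites, which is why product length equals the distance between the outer ends of the two primers.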

Some PCR primers have wide taxonomic latitude, meaning that they 
successfully amplify not only target DNA from the species from which they 
were developed, but also homologous DNA across broader taxonomic 
groups. Other PCR primers work well only within a target species and per- 
haps among its close relatives. Following the introduction of PCR, vast 
research effort has gone into primer identification and development. 
Published compilations of highly conserved primers flanking particular 
loci, such as various mtDNA genes in insects (Simon et al. 1994; Zhang and 
Hewitt 1997), fishes (Meyer 1994a; Normark et al. 1991), birds (Sorenson et 
al. 1999a), and even across broader taxonomic groups (Kocher et al. 1989), 
are available. At the other end of the spectrum are species-specific or genus- 
specific PCR primers, such as those typically employed to amplify 
microsatellite loci. In 2001, a new journal (Molecular Ecology Notes) was 
launched largely to accommodate papers reporting PCR primers and 
microsatellite assay conditions for hundreds of individual plant and animal 
species. 

Another great advantage of PCR over earlier approaches is that it 
enables recovery of assayable quantities of DNA from even tiny bits of bio- 
logical material, such as can be obtained from single feathers (Taberlet and 
Bouvet 1991), hairs (Morin et al. 1992; Taberlet and Bouvet 1992; Vigilant et 
al. 1989), dung (Fernando et al. 2003a; Kohn and Wayne 1997; Palomares et 
al. 2002), dried or ethanol-preserved museum material (Higuchi et al. 1984), 
and even some fossils up to tens of thousands of years old (see Chapter 8). 
Table 3.1 describes some remarkable and even bizarre sources from which 
various DNA sequences have been successfully amplified and assayed. 

However, PCR-based approaches are not entirely free of technical diffi- 
culties. One issue is the degree of fidelity of PCR amplification (Dunning et 
al. 1988; Ennis et al. 1990; Pääbo and Wilson 1988; Saiki et al. 1988). Any mis- 
incorporation of nucleotides, especially in early rounds of amplification, can 
result in an amplified sequence that differs at least slightly from the original 
template. Such low-frequency misincorporation has been observed, but its 
effects in most biological applications probably are negligible (Kwiatowski 
et al. 1991). A second potential difficulty, especially when amplifying loci 
from polyploids or members of multi-gene families, is PCR-mediated 
recombination among the amplification products (Cronn et al. 2002). Third, 
PCR reactions sometimes fail for a variety of reasons, thereby producing 
“null” alleles that can add complications in particular applications, such as 
genetic parentage assessments via microsatellite assays. However, the most 
serious challenges posed by PCR probably stem directly from what is also 
one of the technique's greatest blessings: its extreme sensitivity. Thus, sample 
contamination (by microbes, physical handling, or any source of contact 
with even minuscule amounts of foreign DNA) can sometimes result in 
amplification of molecules other than those intended, and this can occasionally 
result in serious interpretive errors (see Chapter 8). 


TABLE 3.1 Examples of some remarkable biological sources from which DNA 
has been successfully amplified using PCR 

Source                        Biological context 
----------------------------  --------------------------------------------------- 
Sloughed skin                 Sex and individual identification in whales 
Plucked or shed hair          Individual identification and parentage analysis 
Saliva                        Forensic identification of crime suspects 
Egg shells and membranes      Forensic identification in birds 
Soil or clay (bound DNA)      Detection of microbes or other remains 
Subterranean groundwater      Detection of bacteria 
Otoliths ("ear stones")       Population genetic structure in fishes 
Old scales                    Temporal population genetics in fishes 
Old baleen                    Temporal population genetics in whales 
Dried blood, semen            Forensic identification of crime suspects 
Arthropod blood meals         Identification of mosquito hosts 
Extracted blood               Detection of malaria parasites in birds and lizards 
Retail caviar or meat         Species identification of illegal wildlife products 
Stomach contents              Identification of species consumed 
Stomach contents              Paternity assessment of cannibalized offspring 
Regurgitated owl pellets      Dietary analysis for studies of mammal abundance 
Feces                         Individual identification of excrement producer 
Feces                         Species identification of excrement producer 
Feces                         Sex (etc.) identification of excrement producer 
Feces                         Foods consumed by excrement producer 
Carcasses                     Individual identification in an endangered species 
Single diploid cells          Genome analysis, genetic aberrations, cancers 
Single eggs, embryos          Identification of marine organisms 
Single larvae                 Identification of marine organisms 
Single sperm cells            Genetic mapping and other medical applications 
Fossils                       Phylogeography and phylogeny estimation 
Fine plant roots              Genetic assignment to tree species 
Pollen on arthropods          Identification of orchids pollinated by insects 
Dry wood                      Tree identification 
Ancient wood                  Tree identification 

RAPDs 


Any PCR reaction per se is merely prelude to some form of DNA assay. One 
well-known example is a methodology to reveal RAPDs (pronounced 
"rapids," short for "randomly amplified polymorphic DNAs"). This technique 
involves the use of short (ca. 10 bp) PCR primers of arbitrary sequence to 
amplify anonymous genomic sequences. When the resulting products are 
separated in suitable electrophoretic gels (Welsh et al. 1991; Williams et al. 
1990), DNA-level polymorphisms (usually in primer recognition sites) can 
be uncovered. (Detailed laboratory methods are provided in Edwards 
1998.) The RAPD approach was widely employed in population biology, 
especially during the 1990s (e.g., Hadrys et al. 1992; Hedrick 1992; 
Rieseberg 1996), but it developed a reputation for poor reproducibility in 
some cases (e.g., Pérez et al. 1998), and it also has the disadvantage of nor- 
mally revealing only dominant markers (Figure 3.17). For these reasons, its 
use has generally waned, especially with the rise in popularity of 
microsatellite markers, which are also highly polymorphic, but provide 
more informative codominant markers. Nonetheless, RAPD techniques are 
still employed by many laboratories, as they do offer a quick and easy way 
of screening potential molecular markers from many loci (Ritland and 
Ritland 2000). 
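
One practical consequence of dominance is that allele frequencies cannot be counted directly from band patterns. A common workaround, sketched below, is to assume Hardy-Weinberg proportions so that band-absent individuals represent the recessive homozygote class. The function names are hypothetical, and the Hardy-Weinberg assumption is supplied here for illustration; it is not part of the RAPD assay itself.

```python
import math


def null_allele_frequency(n_band_absent: int, n_total: int) -> float:
    """Estimate q, the frequency of the recessive (non-amplifying) allele.

    Assumes Hardy-Weinberg proportions at a two-allele dominant locus, so
    the fraction of band-absent individuals estimates q squared.
    """
    if n_total <= 0 or not 0 <= n_band_absent <= n_total:
        raise ValueError("counts must satisfy 0 <= n_band_absent <= n_total, n_total > 0")
    return math.sqrt(n_band_absent / n_total)


def expected_heterozygote_fraction(q: float) -> float:
    """Expected fraction of heterozygotes (2pq) hidden among band-present individuals."""
    return 2.0 * q * (1.0 - q)


if __name__ == "__main__":
    q = null_allele_frequency(n_band_absent=25, n_total=100)
    print(f"q = {q:.2f}, expected heterozygotes = {expected_heterozygote_fraction(q):.2f}")
```

With codominant markers such as microsatellites, heterozygotes are scored directly and no such assumption is needed.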


STRs (microsatellites) 


PCR-based assays of “short tandem repeat” loci (STRs, or “microsatellites”) 
have become probably the most popular and powerful of the current meth- 
ods for identifying highly polymorphic Mendelian markers (Li et al. 2002; 
Scribner and Pearce 2000). Each microsatellite locus consists of reiterated 
short sequences (usually di-, tri-, or tetranucleotides) that are tandemly 
arrayed at a particular chromosomal location (Hamada et al. 1984), with 
variation in repeat copy number often underlying a profusion of distin- 
guishable alleles (sometimes 20 or more) within a population (see Figure 
2.6). Thus, a microsatellite array is reminiscent of a minisatellite array, 
except that each of its repeat units is much shorter. This means that alleles 

Figure 3.17 Nature of data generated by RAPD procedures. In this approach, 
short random primers are employed to amplify anonymous DNA segments (in this 
case, from four unlinked gene regions) by PCR. The presence of a band indicates 
successful amplification; absence indicates unsuccessful amplification, perhaps 
resulting from naturally occurring mutations in the primer recognition site. Thus, 
RAPDs in diploid organisms usually behave in dominant/recessive fashion 
(although cases of codominant inheritance are known; e.g., Fritsch and Rieseberg 
1992). As exemplified in this figure, the RAPD approach can be especially useful in 
characterizing first-generation or later-generation hybrids between genetically 
distinct species (here, S1 and S2). 


at STR loci can be distinguished in acrylamide (rather than agarose) gels, 
and without need for arbitrary binning (because even similar-sized alleles 
normally differ in readily detected increments of 2, 3, or 4 base pairs). 
Microsatellite loci were discovered in the late 1980s (Litt and Luty 1989; 
Tautz 1989; Weber and May 1989) and soon were shown to be characteristic 
features scattered abundantly throughout the nuclear genomes of most 
plants (Morgante et al. 1998; Nybom et al. 1992; Squirrell et al. 2003) and ani- 
mals (Ellegren 1991; Fries et al. 1990; Stallings et al. 1991), including humans 
(Valdés et al. 1993). They also occur in cytoplasmic genomes, including ani- 
mal mtDNA (Lunt et al. 1998), although the conventional term "STRs" is 
normally taken to imply nuclear markers. 

Microsatellite or STR variants are sometimes also referred to as simple- 
sequence length polymorphisms (SSLPs; Schlötterer et al. 1991). Techniques 
for their assay are outlined in Figure 3.18. The first and most laborious step 
is primer development, which necessitates constructing a genomic library 
for the target species, screening that library for clones that contain 
microsatellite repeats, sequencing the inserts from those positive clones, 
and using the information from unique sequences flanking each repeat 
region to synthesize PCR primers. Once primers are available, however, 
large numbers of individuals can be readily screened for Mendelian geno- 
types at specific STR loci displaying codominant alleles (Figure 3.19). 
Further details of laboratory protocols and analysis methods for 
microsatellites are given in Ciofi et al. (1998), Goldstein and Schlötterer 
(1999), and Zane et al. (2002). 

The discrete genotypic data provided by microsatellites are in several 
ways analogous to those provided by allozymes. However, population vari- 
ation is typically much higher at STR loci (compare Figures 2.1 and 2.6) due 
to the high mutation rate of microsatellites: often about 10⁻³ or 10⁻⁴ per locus 
per gamete per generation (Primmer et al. 1996; Schug et al. 1997; Weber and 
Wong 1993). Indeed, de novo microsatellite mutations are occasionally 
uncovered in genetic parentage analyses (e.g., Jones et al. 1999a). Many, but 
not all, of these mutations result in alleles of the adjacent size classes, so that 
the mutational process tends to be imperfectly stepwise or ladderlike. Much 
discussion in the literature has centered on whether genetic distances based 
on microsatellite data should incorporate not only the frequencies of alleles, 
but also their size relationships to one another (e.g., Goldstein et al. 1995a; 
Nauta and Weising 1996; Ruzzante 1998). Another important point is that 
because of the rapid evolutionary pace of microsatellites and the underlying 
nature of their mutational processes, alleles that are identical in state (size) 
are not necessarily identical by descent (Angers and Bernatchez 1997; Estoup 
et al. 1995; Garza and Freimer 1996; Ortí et al. 1997b; van Oppen et al. 2000; 
see review in Estoup et al. 2002). This characteristic of microsatellites can 
often cause serious interpretive difficulties, especially in studies of geo- 
graphic population structure (Balloux and Lugon-Moulin 2002). 
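
Why identity in state need not imply identity by descent can be illustrated with a deliberately simplified calculation. The sketch below assumes a strict, symmetric one-step mutation model (each mutation adds or removes exactly one repeat unit with equal probability); this is an idealization introduced here, since the text notes that the real process is only imperfectly stepwise. The function name is hypothetical.

```python
from math import comb


def prob_same_size_after(n_mutations: int) -> float:
    """Probability that an allele is back at its starting size after
    n_mutations single-step changes, each +1 or -1 repeat with probability 1/2.
    """
    if n_mutations < 0:
        raise ValueError("n_mutations must be non-negative")
    if n_mutations % 2 == 1:
        return 0.0  # an odd number of single steps can never net to zero
    # Need exactly half of the steps to be gains and half to be losses.
    return comb(n_mutations, n_mutations // 2) / 2 ** n_mutations


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(f"{n} mutations: P(same size) = {prob_same_size_after(n):.3f}")
```

Even after several mutations, a lineage retains an appreciable chance of displaying its ancestral allele size, so two alleles of equal length may carry quite different mutational histories.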
Figure 3.18 General protocol for microsatellite assays. 


AFLPs 


The technology for revealing “amplified fragment-length polymorphisms” 
(AFLPs; Vos et al. 1995) has elements of both PCR-based and RFLP-based 
methods. The laboratory protocols, described in Matthes et al. (1998), are 
rather complicated, but the basic outline is as follows. The goal is to selec- 
tively amplify a subset of restriction fragments from a complex mixture of 
fragments produced by digestion of genomic DNA by two restriction 
endonucleases, one with a 4-bp and one with a 6-bp recognition site. These 
fragments are linked to adapter sequences and biotinylated in such a way 
that subsequent PCR amplifies only a small subset of them, thus reducing 
what would otherwise be a hopelessly heterogeneous mix of different- 
sized genomic fragments to a manageable level. Even so, the resulting band 
Figure 3.19 Extensive polymorphism at a microsatellite locus. Shown is an 
autoradiograph of a gel displaying more than 50 alleles in a local population of 
deer mice (unpublished data, Avise laboratory). All individuals shown are het- 
erozygous (each displays two primary bands). Specimens were purposely arranged 
from left to right such that one of the two alleles in each contributes to a steplike 
series of consecutive-sized alleles. Actually, most microsatellite assays today 
involve the use of fluorescent dyes and computer gel scans to score alleles and 
genotypes. Several microsatellite loci can sometimes be "multiplexed" and run on 
the same gel, with their alleles distinguished by use of a different fluorescent dye 
for each. 


profiles on polyacrylamide gels are complex, reminiscent of those in multilocus 
minisatellite DNA fingerprints. Polymorphisms are then scored as 
differences in the lengths of amplified fragments, which may be due to base 
substitutions in or near the restriction sites or to sequence insertions or 
deletions. Unfortunately, most of these molecular polymorphisms, like 
RAPDs, display genetic dominance. Nonetheless, the AFLP approach has 
found application in various genomic and forensic analyses that require 
large numbers of qualitative, mostly unlinked polymorphisms (Mueller 
and Wolfenbarger 1999). When the dominant markers can be suitably analyzed, 
this technique (like RAPDs) can also find application in estimating 
pairwise relatedness between individuals from large numbers of loci 
(Hardy 2003). 


SINEs 


"Short interspersed elements" (SINEs) have proved to be superb phyloge- 
netic markers for identifying monophyletic groups (clades). Typically, each 
SINE is a tRNA-derived retropseudogene (Nei and Kumar 2000) residing at 
a specific chromosomal location. Large numbers of SINEs are dispersed 

Figure 3.20 Methodology and logic underlying the use of SINEs for phylogenetic 
purposes. (A) A SINE flanked by unique genomic sequences from which PCR 
primers (represented by arrows) are generated. (B) Diagram of electrophoretic gels 
showing PCR products in which two different SINE loci (L1 and L2) are assayed 
for presence (+) versus absence (-) of the expected fragment size in four different 
host taxa (gel lanes 1-4). (C) Phylogenetic summary corresponding to the results in 
B. Taxa 1, 2, and 3 apparently share a common ancestor that had the SINE L2 
inserted into its genome, and taxa 1 and 2 share a common ancestor that must have 
later acquired the SINE L1 insertion. (After Shedlock and Okada 2000.) 


throughout the genomes of eukaryotic organisms, but each occupied site 
presumably represents a single evolutionary insertion event. The trick is to 
design PCR primer pairs specific to the unique flanking regions of a given 
SINE, then repeat the process for each of many independent SINEs. This 
battery of primers is then employed in PCR amplification applied to 
genomic DNA isolated from species of interest. The PCR products are elec- 
trophoresed, and each taxon is then scored for presence versus absence of 
each SINE element (Figure 3.20). 

Taxa that prove to share even one or a few independent SINEs almost 
certainly belong to a clade because the presence of a given SINE probably 
signals a single evolutionary acquisition. SINEs are stable once inserted, so 
SINE absence at a given chromosomal site is presumably the original ances- 
tral condition (see Figure 3.20). Accordingly, a strong case has been made 
that SINEs are among the most powerful of all molecular markers for phy- 
logenetics (Shedlock and Okada 2000). Although the SINE approach has 
been employed in relatively few laboratories, it has proved to be extremely 
useful for phylogenetic reconstruction at diverse evolutionary time frames 
in several taxonomic groups, such as fishes (Hamada et al. 1998; Murata et 
al. 1996; Takahashi et al. 1998) and mammals (Nikaido et al. 1999, 2003; 
Shimamura et al. 1997). 
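
The clade logic of Figure 3.20 can be expressed as a short sketch. It assumes that presence of an insertion is a shared derived state and absence is ancestral, as argued above; the data structure and function names are hypothetical, and the example scores are those described in the caption of Figure 3.20 (L1 present in taxa 1 and 2; L2 present in taxa 1, 2, and 3).

```python
from __future__ import annotations

from itertools import combinations


def clades_from_insertions(presence: dict[str, set[str]]) -> dict[str, frozenset[str]]:
    """Map each SINE locus to the set of taxa carrying the insertion.

    Loci present in fewer than two taxa are skipped because they group nothing.
    """
    return {
        locus: frozenset(taxa)
        for locus, taxa in presence.items()
        if len(taxa) >= 2
    }


def compatible(clades: dict[str, frozenset[str]]) -> bool:
    """True if every pair of inferred clades is nested or disjoint,
    i.e., all insertions can be placed on a single tree without conflict."""
    for a, b in combinations(clades.values(), 2):
        if not (a <= b or b <= a or a.isdisjoint(b)):
            return False
    return True


if __name__ == "__main__":
    scores = {
        "L1": {"taxon1", "taxon2"},
        "L2": {"taxon1", "taxon2", "taxon3"},
    }
    clades = clades_from_insertions(scores)
    for locus, members in clades.items():
        print(locus, sorted(members))
    print("mutually compatible:", compatible(clades))
```

A pair of loci whose taxon sets overlap without nesting would signal conflict, which under the single-insertion assumption should be rare.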
SSCPs 


Especially in population genetic applications, the expense and effort of fully 
sequencing PCR products (see next section) for large numbers of individu- 
als is sometimes prohibitive, but shortcuts are available. One such approach, 
known as “single-strand conformational polymorphism” (SSCP; Orita et al. 
1989), takes advantage of the fact that single-stranded (denatured) DNA 
molecules a few hundred bp in length often assume different conformations 
even when differing by as little as one base pair. These distinctive confor- 
mations can be detected by electrophoresing PCR-amplified molecules 
through neutral polyacrylamide gels (Sunnucks et al. 2000). The SSCP 
approach is usually employed in conjunction with spot sequencing of rep- 
resentative PCR products so that conformational bands on SSCP gels can be 
related to particular full-length DNA sequences. The technique can also be 
used to separate haplotypes as a prelude to their further characterization by 
other techniques, such as by direct sequencing (Ortí et al. 1997a). Another 
approach with similar potential applications is DGGE, or denaturing gradi- 
ent gel electrophoresis (Lessa 1993; Myers et al. 1986). Lessa and Applebaum 
(1993) review and compare these and other related shortcut screening meth- 
ods for detecting allelic variation in DNA sequences. 


SNPs 


Another commonly used acronym refers not so much to any particular assay 
approach, but rather to “single nucleotide polymorphisms,” regardless of 
how they are revealed (Brookes 1999). Several of the techniques described 
above (such as various RFLP and AFLP analyses) in effect often reveal SNPs, 
but not at the detailed molecular level, and not necessarily in distinction 
from small sequence insertions or deletions (indels). True SNPs are specific 
base-pair variants, and they are abundant in most genomes (approximately 
1.5 million are known in humans, for example; Kendal 2003). 

In principle, and sometimes in practice, SNPs provide a wellspring of 
molecular markers well suited to genomic analyses such as studies of link- 
age and linkage disequilibrium (Stephens et al. 2001a). In recent years, sev- 
eral laboratory approaches have been developed explicitly to screen large 
numbers of SNPs in well-characterized model genomes, such as those of 
yeast (Winzeler et al. 1998) and humans (See et al. 2000; D.G. Wang et al. 
1998). Some of these methods (Gilles et al. 1999; Syvänen 1999) incorporate 
microarray or microchip technologies (microchips are miniaturized holding 
platforms for nucleic acids), which have found broad application especially 
in meta-analyses of gene expression patterns. Because methods for massive 
screening of SNPs have not yet been widely developed and employed in eco- 
logical and evolutionary studies, they will not be elaborated upon here, but 
these technologies may be adapted for some such purposes in the future. 
HAPSTRs and SNPSTRs 


A new approach, still mostly in the developmental phase, is to utilize a 
combination of molecular techniques that weds the strengths of different 
kinds of molecular markers. For example, STR loci are typically highly poly- 
morphic, but also suffer from extensive parallel and convergent evolution 
due to recurrent mutations to finite numbers of length-defined allelic states. 
By contrast, unique sequences that flank these microsatellites might provide 
a clearer genealogical signal, but they suffer the disadvantage (especially for 
microevolutionary reconstructions) of slow evolutionary rates. In an 
attempt to combine the advantages of clarity in phylogenetic signal and 
high polymorphism, Hey et al. (2004) introduced a laboratory protocol that 
involves separating STR alleles and their long flanking sequences by size, 
cutting these alleles from electrophoretic gels, and then sequencing flanking 
DNA to reveal individual haplotypes for genealogical appraisal. The 
approach was nicknamed HAPSTR (for haplotypes at STR regions). A simi- 
lar protocol that combines SNP analysis at short autosomal sequences with 
STR assays was given the acronym SNPSTR (Mountain et al. 2002). 


DNA sequencing 


Two DNA sequencing methods have been available since the mid-1970s 
(Figure 3.21). One approach, introduced by Maxam and Gilbert (1977, 1980), 
relies on chemical cleavage reactions specific to individual nucleotides (A, T, 
C, or G). Ends of a targeted stretch of DNA are radioactively labeled, and the 
DNA is divided into four subsamples, which then are treated with different 
chemical reagents that cleave at base-specific positions. For example, one sub- 
sample is treated with dimethyl sulfate and piperidine, which results in DNA 
cleavage only at G positions. Reactions are carried out under conditions such 
that only a small, random fraction of sites is cleaved in any molecule, so that 
the composite digestion contains a collection of molecular fragments of vary- 
ing lengths, each terminated at a G position. The fragments are then separat- 
ed electrophoretically in a polyacrylamide gel and visualized by autoradiog- 
raphy. Parallel reactions specific for the other three bases are likewise carried 
out, and fragments are separated on adjacent lanes of a gel. Then, DNA 
sequence is read directly from ladderlike bands in the autoradiograph. 

A different sequencing procedure, introduced at the same time by 
Sanger et al. (1977), is the forerunner of techniques employed today. This 
method relies on controlled interruption of in vitro DNA replication (Figure 
3.21). Double-stranded DNA is denatured to single strands, and a short 
DNA segment (the primer) known to be complementary to a sequence in 
targeted DNA is annealed to that target sample. This primer/template mixture 
is divided into four subsamples, each of which is subjected to a primer 
extension reaction catalyzed by DNA polymerase. Each reaction mixture 

Figure 3.21 General protocols for DNA sequencing. Shown are the Maxam-Gilbert 
(left panel) and Sanger (right panel) approaches. Both begin by isolating DNA 
and denaturing it into single strands. Maxam-Gilbert approach: (1) radioactively 
end-label; (2) cleave chemically in four treatments. Sanger approach: (1) anneal 
primer; (2) add labeled dXTPs, DNA polymerase, and ddXTP in four treatments. 
(After Hillis et al. 1990.) 

contains the four deoxynucleotides (dA, dC, dG, and dT), plus a single 
dideoxynucleotide (ddN), which is a nucleotide that lacks the 3’ OH group 
present in deoxynucleotides. The newly synthesized DNA strand is made 
radioactive, either by labeling the end of the primer or by incorporating a 
labeled deoxynucleotide during synthesis. DNA sequence extension occurs 
by attachment of nucleotides to a free 3’ OH, so that wherever ddNTP has 
been incorporated into the growing strand, further extension is arrested. 
The polymerase reaction is carried out under conditions such that incorpo- 
ration of ddNTPs is rare and random. Thus, different DNA molecules in a 
subsample achieve varying lengths before termination at a particular base. 
As under the Maxam-Gilbert approach, fragments from the four subsam- 
ples are separated electrophoretically in a polyacrylamide gel and visual- 
ized by autoradiography, and the DNA sequence is read directly. 
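
How a sequence is read from the four lanes can be sketched in a few lines. This is an idealized illustration, not a simulation of the chemistry: it assumes that every possible termination length is represented in the appropriate lane, it ignores the length of the primer, and the function names are hypothetical. The example strand is the one drawn in Figure 3.21.

```python
def sanger_lanes(new_strand: str) -> dict:
    """For each ddNTP reaction (A, C, G, T), list the fragment lengths produced.

    A fragment of length k terminates with a dideoxynucleotide at position k,
    so the lane for a given base holds every k at which the newly
    synthesized strand carries that base.
    """
    lanes = {base: [] for base in "ACGT"}
    for k, base in enumerate(new_strand.upper(), start=1):
        lanes[base].append(k)
    return lanes


def read_gel(lanes: dict) -> str:
    """Read the sequence from shortest to longest fragment across all lanes."""
    base_at_length = {k: base for base, lengths in lanes.items() for k in lengths}
    return "".join(base_at_length[k] for k in sorted(base_at_length))


if __name__ == "__main__":
    strand = "GATCAGGCTTAAGCA"  # sequence shown in Figure 3.21
    lanes = sanger_lanes(strand)
    for base in "ACGT":
        print(base, lanes[base])
    print(read_gel(lanes))
```

Because each fragment length appears in exactly one lane, ordering the bands by size recovers the sequence; the same ladder-reading logic applies to the Maxam-Gilbert approach.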

Before PCR was invented, genes normally had to be laboriously cloned 
through biological vectors (see Figure 3.11) to make them suitable for direct 
sequencing. Otherwise the gene would merely be part of a complex mixture 
of many sorts of DNA sequences isolated from a given tissue, and its 
sequence would have been at insufficient concentration to prime the 
sequencing reactions. With PCR, both of these interrelated difficulties are 
overcome in one simple, straightforward, and fast set of DNA cycling con- 
ditions. Furthermore, amplification primers for PCR can be employed as 
primers in the sequencing reactions, allowing PCR and DNA sequencing to 
be directly coupled (methodological details are described in Hillis et al. 
1996; Volckaert et al. 1998). PCR and sequencing procedures have become 
increasingly automated and are now carried out routinely with PCR ther- 
mocyclers coupled to sequencing apparatus. The output typically consists of 
computer printouts of DNA sequences such as that shown in Figure 3.22. 

PCR-mediated DNA sequencing has made some (but certainly not all) 
earlier methods for generating molecular markers, especially for phyloge- 
netic purposes, nearly obsolete (e.g., Hillis et al. 1996; Miyamoto and 
Cracraft 1991). In recent years, DNA sequence information has increased 
explosively, such that by the beginning of 2003 nearly 25 million sequences, 
representing 115,000 taxa, already had been deposited in GenBank (a stan- 
dard computer repository for such information). 


CTTTTGTGTACCTTTGACTTTGACTGTGTGTGTGTGTGTGTGTGTGTGTGCGTGTGTGTGTGCGTGTGGAGGCATTGAC 

Figure 3.22 Data output from a typical DNA sequencing reaction. Shown is a 
computer printout in which successive peaks in the graph (normally color-coded) 
describe bases (labeled along the top of the graph) at successive nucleotide posi- 
tions in one of the DNA strands. In this example, the sequence from positions 83 to 
118 is a microsatellite region (composed mostly of tandem copies of a GT repeat) 
flanked by unique sequences that could serve as a basis for developing microsatel- 
lite PCR primers. 
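
Locating such a repeat region computationally is straightforward. The sketch below is an illustrative helper, not a published tool: the function name is hypothetical, the sequence is transcribed from the printout in Figure 3.22 as it appears here, and only uninterrupted runs of the repeat unit are counted (the GT array in the figure is interrupted in places).

```python
import re
from typing import Optional, Tuple

# Bases transcribed from the Figure 3.22 printout as reproduced in this text.
FIG_3_22 = (
    "CTTTTGTGTACCTTTGACTTTGACTGTGTGTGTGTGTGTGTGTGTGTGTGCGTGTGTGTGTGCGTGTGGAGGCATTGAC"
)


def longest_repeat_run(seq: str, unit: str) -> Optional[Tuple[int, int]]:
    """Return (start, n_units) for the longest uninterrupted tandem run of unit.

    start is a 0-based index into seq; None is returned if unit never occurs.
    Ties are resolved in favor of the first run encountered.
    """
    best = None
    pattern = f"(?:{re.escape(unit.upper())})+"
    for match in re.finditer(pattern, seq.upper()):
        n_units = (match.end() - match.start()) // len(unit)
        if best is None or n_units > best[1]:
            best = (match.start(), n_units)
    return best


if __name__ == "__main__":
    start, n_units = longest_repeat_run(FIG_3_22, "GT")
    print(f"longest GT run: {n_units} units starting at index {start}")
    print("left flank :", FIG_3_22[:start])
    print("right flank:", FIG_3_22[start + 2 * n_units:])
```

The flanking stretches printed at the end are the kind of unique sequence from which locus-specific primers would be designed.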
Categorical Breakdowns of Molecular Methods 


In a sense, any assay method that stops short of obtaining DNA sequence 
can be thought of as providing an indirect and incomplete picture of genet- 
ic information at the loci screened. Indeed, nucleotide sequence data permit 
recovery of genetic information at less detailed levels (amino acid 
sequences, RFLP maps, etc.), whereas the converse is clearly not true (Box 
3.2). This is not to imply that DNA sequencing is always the preferred 
method for obtaining molecular markers suitable for a given research prob- 
lem in molecular ecology and evolution. For logistic reasons, DNA 
sequences usually are gathered from only one or a few genes in a given 
study and from relatively small numbers of individuals. Thus, sequence 
data provide high-resolution pictures of molecular differences (transitions 
versus transversions, synonymous versus non-synonymous substitutions, 
nucleotide substitutions versus insertions or deletions, changes in coding 
versus non-coding regions, etc.), but typically at the expense of sacrificing 
genetic information from a broad base of loci and large numbers of individ- 
uals. Furthermore, because single nucleotide positions can assume only four 
alternative states (A, T, C, or G), they are individually far more subject to 
phylogenetic noise (homoplasy; see below) than are some other kinds of 
molecular markers (such as SINE insertions). Researchers must weigh these 
and additional considerations when contemplating biological applications 
for molecular markers (Zhang and Hewitt 2003). 


BOX 3.2 The Nature of Nucleotide Sequence Data 


The tables shown on the next two pages compare DNA sequences from the 
mitochondrial cytochrome b gene in 14 samples (A-N) from several marine tur- 
tle species. The full data set (Bowen et al. 1993a) involved more than 500 
nucleotide positions per sample, but only 63 positions are shown here. Dashes 
indicate a state identical to that of the reference sample "a." 

Any of these coded data sets could be analyzed phylogenetically by 
appropriate computer programs. Note that when data are coded as purines 
versus pyrimidines (2), only transversions are counted in the resulting phylo- 
genetic estimates; and when data are coded as amino acid sequences (3), only 
replacement substitutions are counted. These latter two treatments exemplify 
a trade-off common to most sequencing studies: Although they weight heavily 
for mutational events that are rare, and thus are less likely to show homoplasy 
over short evolutionary time, much information of potential phylogenetic sig- 
nificance (especially at lower sequence divergence levels) is lost by neglecting 
silent transitions. 
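
The two recodings described in this box can be reproduced with a short sketch. The function names are hypothetical; the codon table is the vertebrate mitochondrial code (NCBI translation table 2) written with codons in T, C, A, G order; and one-letter amino acid symbols are used in place of the three-letter abbreviations in the table. The input is the reference sequence "a" from panel (1).

```python
BASES = "TCAG"

# Vertebrate mitochondrial genetic code (NCBI translation table 2),
# codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG; "*" marks stop codons.
VERT_MITO = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)

CODON_TABLE = {
    b1 + b2 + b3: VERT_MITO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

RY_CODE = {"A": "0", "G": "0", "C": "1", "T": "1"}  # purine = 0, pyrimidine = 1


def purine_pyrimidine(seq: str) -> str:
    """Recode a DNA sequence as purines ("0") versus pyrimidines ("1")."""
    return "".join(RY_CODE[base] for base in seq.upper())


def translate_mito(seq: str) -> str:
    """Translate complete codons using the vertebrate mitochondrial code."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


REFERENCE_A = "ACCGGAATCTTCTTGGCAATACACTATTCA"  # sample "a", panel (1)

if __name__ == "__main__":
    print(purine_pyrimidine(REFERENCE_A))
    print(translate_mito(REFERENCE_A))
```

Note that the codon ATA is read as methionine under the mitochondrial code, matching the "met" in panel (3); under the standard nuclear code it would be read as isoleucine. A transition such as A to G leaves the purine/pyrimidine string unchanged, which is why coding (2) registers only transversions.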
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Partial DNA and protein sequence data for mitochondrial cytochrome b 
in 14 marine turtle samples ? 


(1) Nucleotide sequences 





a ACC GGA ATC TTC TTG GCA ATA CAC TAT TCA 
Du. cese Mere, vee, CAS a Rd. rT Coe 
Ge eevee: E ON CAS exl. hore, erp 0E TO d 
d. wa eM OI Ge AY ize nk We SOON. 
E. CENT On US Be ues (QUA. en eee, UT, See Cele 
fae am GE. ne CORA uq item RIS vC. ghee 
WA Wedpb Ll Sis, cete DC RAD ure , esum. ceo oo C eee 
DLE mcn cup Lees CIAR LS xc v LC AMA» 
D oanex-e c omne. eT. ODIAW SE. Ier Kr CN 
ho Deseo ras Seige Bie TCA Sm CN AES ok: Sa 
kém. LUI fe gene TOMA NE ees RIT CH 
e te uno PEE UE NOXIUM. A Nous 
m PEE Jo RE DUNS ER p E RELIER A Era d ce ee 
nomeu. eo gee E RA eto UM ieee I cog 


(2) Same data coded as purines ("0") versus pyrimidines (“1”) E: 


a 011 000 011 111 110 2010 010 101 101 110 
b ELA qtu us is, t Ern it 
" IR C UE COR pr uM REY NET occ. RE 
dui «e 1e ccr TAE €US Ee wc aeo 
e --- a coe ems fnt. --- ene Ste: a Ee m -T-- -em < oe 
f --- --- T 2 =--> eee T eee M eee R 
EDI a EM Mam. cL. a L ee 
Ru. BN wo E E o T Ne Pe ee ee 
i C MEE E MEME: c E N na sa d 
Dé ue doloe cse etus cei S poH 4 
k P fle E Ee Rig Gee OP SCIO CS REX EAT oC s 4 
Ld GR AE K EL NOR AE tia Gea A UE 1 
NY gee Soe REI MENTEM ee cr CRM 
n d es TE. Es E m E ut ae 3425 14 LY 


(3) Same data translated into amino acid sequences (by reference to the 
mitochondrial genetic code) 


a thr gly ile phe leu ala met his tyr ser 
b ~ — = — — — — => = =% 
c = ans o. ee - La Dens ES = Da. 
d ^ = pos = 2- ES n dac = = 
e = = ids A3 = e = E — e 
f — — val — — — — — — — 
g = — —. — — — — — — — — 
h — — — — — = — — — — i 
i ey = T E er an im 4 a HT aN 
j m a CS m. = = CES zm = 2. 
k Et = = = -— S = Lg 
1 E ir p^ = = a ES ET : 
m — — — — — = — fi 
-n — — — — — Ill = — = ; 


Source: After Bowen et al. 1993a. 
"Dashes represent a character state identical to that of the reference sample “a.” E^ 
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110 001 01ł 112 110 (010 111 110 110 011 011 
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Fy foie aat ices. eee doe Ee nae San iS ee Se i ek a EA 
E JENNA Ar" LEUTE S D ST MEN Pe 
fe Tu d ag TF it usc oec AN ER. owed D 

le ee M CT RENE STE REN eee ak T HE TIRES SG es 

E R IG vas eee tO en ET ESTIS eg eS 

LEUR e uud Se CA, LT dn peau Doi ES UN are ae d 

vo eod Ka LO ate 5 Ei D Rares pU. e I LAE MAE, merat rris tee 

Lis ee eae ae a rw. CO et 2E A RR nap 5m 

^x WE Lr ttu Cow 3n d Vox dem nop dh MEN Bae ieee 

m d eeu PE, Soa Se Be m Cea Aen APR. E sce 

E pro asp thr ser leu ala phe ser ser ile „ile 
E. — — ije — met — — — — — thr 
E. — — ile => met — — — — — thr 
iy — — ile — met — — — — -< ser 
E — -— ile — met - -— -— -— = ser 
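The three codings of reference sample "a" can be reproduced with a few lines of code. The Python sketch below is illustrative only: the function names are hypothetical, and the codon table is an assumption built from the standard genetic code with the vertebrate mitochondrial reassignments (ATA = Met, TGA = Trp, AGA and AGG = stop) applied on top. It assumes the input contains only A, C, G, and T.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), STANDARD_AA)}
# Vertebrate mitochondrial reassignments (assumed here, not given in the text).
CODON_TABLE.update({"ATA": "M", "TGA": "W", "AGA": "*", "AGG": "*"})


def purine_pyrimidine(seq):
    """Code purines (A, G) as '0' and pyrimidines (C, T) as '1'."""
    return "".join("0" if base in "AG" else "1" for base in seq)


def translate_mito(seq):
    """Translate complete codons into one-letter amino acid symbols."""
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First ten codons of reference sample "a" from the table above.
sample_a = "ACCGGAATCTTCTTGGCAATACACTATTCA"
ry = purine_pyrimidine(sample_a)
protein = translate_mito(sample_a)
print(ry)
print(protein)
```

Note that the seventh codon, ATA, is read as methionine only because the mitochondrial table is used; under the standard code it would be isoleucine.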
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One initial way to organize our thinking about alternative approaches 
for acquiring genetic markers is to subdivide the diverse types of molecular 
data into broad categories for which shared philosophical approaches to 
analysis and interpretation may pertain. The following are some of these 
partitions (Figure 3.23). 


Protein versus DNA information 


An obvious consideration is whether a molecular technique reveals variation
at the level of proteins or at the level of DNA. Methods in the former
category primarily unmask genetic changes in coding regions that alter
amino acid sequences. By contrast, the various nucleic acid techniques
provide access to a far greater panoply of architectural changes in both
coding and non-coding regions of the genome. Thus, informative markers
can come from synonymous as well as non-synonymous (amino acid-altering)
nucleotide substitutions in protein-coding sequences, genetic
changes in introns and gene-flanking regions, additions and deletions of
genetic material, sequence rearrangements, and other such DNA-level
features. 







Figure 3.23 Alternative conceptual ways to "slice the pie" of representative 
classes of molecular genetic data. The heavy black line separates protein assays 
from those dealing directly with DNA. The heavy gray line divides the methods 
according to whether raw data consist of qualitative character states or distance 
values only. Lightly shaded slices of the pie indicate techniques that normally sup- 
ply information from only one gene (linkage group) at a time, as opposed to the 
remaining methods, which usually access genetic data from multiple loci simulta- 
neously (darkly shaded slices) or cumulatively (open slices). 


Discrete versus distance data 


Some molecular techniques, notably protein immunology (via MCF) and 
DNA-DNA hybridization, provide raw information solely in the form of 
numerical or quantitative distance estimates between taxa (Springer and 
Krajewski 1989). Other molecular approaches provide raw data in various 
forms of qualitative character states (Table 3.2), such as electromorph alleles, 
restriction sites or fragments, PCR-amplified gel bands, or nucleotide 
sequences. Data from such discrete characters can be converted to quantita- 
tive estimates of genetic distance, if so desired, but the converse is not true; 
that is, discrete character states cannot be recovered from distance data alone. 
This distinction between qualitative and distance data is important for two 
reasons. First, many biological applications, such as forensic identification, 
parentage assessment, kinship appraisal, gene flow estimation, and charac- 
terization of hybrids, require qualitative genetic markers, whereas other 
applications, such as phylogeny estimation, can employ either discrete or dis- 
tance data. Second, several phylogenetic tree-building algorithms require dis- 
crete data, whereas others utilize matrices of genetic distances among taxa. 
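
The one-way nature of this conversion can be illustrated with a small sketch. The Python example below is hypothetical (the taxa, sequences, and function names are invented for illustration): two different character matrices yield exactly the same matrix of pairwise distances, so the character states could not be recovered from the distances alone.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(characters):
    """Map each unordered pair of taxa to its pairwise distance."""
    return {(s, t): p_distance(characters[s], characters[t])
            for s, t in combinations(sorted(characters), 2)}


# Two invented character matrices with entirely different states.
matrix_1 = {"X": "ACGT", "Y": "ACGA", "Z": "TCGA"}
matrix_2 = {"X": "GGCC", "Y": "GGCT", "Z": "AGCT"}

d1 = distance_matrix(matrix_1)
d2 = distance_matrix(matrix_2)
print(d1 == d2)  # the distance matrices are indistinguishable
```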
Comparisons among several molecular techniques [...] at particular loci
in a population genetic context

Criterion             AFLP      RAPD      STR         nRFLP       mtRFLP    Allozymes

Number of loci        Many      Many      Several     Few         "One"     Many
typically assayed

Many alleles          No        No        Yes         Yes         Yes       Yes
often identifiable
per locus?

Replicability         High      Variable  High        High        High      High
of assays

Resolution            High      Moderate  High        High        High      Moderate
of genetic
differences

Nature of markers     Dominant  Dominant  Codominant  Codominant  Maternal  Codominant
(most often)

Ease of use and       Moderate  Easy      Difficult   Difficult   Moderate  Easy
development

Laboratory            Short     Short     Long        Long        Short     Short
development time

Note: For similar but not always identical appraisals, see also Hillis et al. 1996; Karp and Edwards 1997; 
Mueller and Wolfenbarger 1999. 

The concept of genetic distance is fundamental to molecular systematics.
A genetic distance between two sequences, individuals, or taxa is a 
quantitative estimate of how divergent they are genetically. Units of
distance depend on the nature of the molecular information used. For example, 
Nei's (1972) D for protein electrophoretic data is interpreted as the net num- 
ber of codon substitutions per locus that have accumulated between popu- 
lations since their separation, and its values can be either corrected or uncor- 
rected for presumed multiple substitutions ("hits") at particular amino acid 
sites. An analogous measure for DNA restriction site or sequence data is p, 
the estimated number of base substitutions per nucleotide site (or percent 
sequence divergence if uncorrected for multiple hits). Genetic distance
values (ΔTm) from DNA-DNA hybridization have no immediate molecular 
interpretation beyond measured difference in thermal stability per se, 
although independent information typically permits calibration of ΔTm to 
magnitude of sequence divergence. Similarly, immunological distance (ID) 
values from micro-complement fixation merely describe antigen-antibody 
cross-reactivity (although with additional empirical information they can be 
calibrated to numbers of amino acid substitutions). Much debate has cen- 
tered on which particular statistical estimators of genetic distance are most 
appropriate for various classes of molecular information (e.g., Kalinowski 
2002; Tomiuk and Loeschcke 2003). Definitions of some standard distance 
measures for protein and DNA data are summarized in Box 3.3. 

The converse of genetic distance is genetic similarity (or the degree of 
genetic "identity"; Tomiuk et al. 1998). Thus, when genetic distance is low, 
genetic similarity is high, and vice versa. In the early literature of molecular 
evolution, it was customary to refer to "percent homology" between DNA 
sequences or other molecular characters under comparison. This practice is 
now discouraged. The word "homology" properly refers to organismal fea- 
tures (such as particular genes) that trace to a shared ancestral condition, so 
that sequences are either homologous or they are not. (Complications can 
arise, however, when a sequence includes regions of both homologous and 
non-homologous origin.) In any event, truly homologous sequences among 
an array of organisms can exhibit a wide range of genetic similarities, the 
values depending in large part on how long ago the extant taxa separated 
from common ancestors. 

Discrete character data are usually presented as a matrix that assigns a 
character state to each taxon for each character (e.g., Boxes 3.1 and 3.2). In 
an allozyme survey, for example, the gene for LDH could be considered a 
character, with its different states being the observed electromorphs. In a 
DNA sequence, a character might be a particular nucleotide position, with 
possible states A, T, C, or G. Thus, allozyme loci and nucleotide sites are 
examples of multi-state characters that can display three or more variable 
conditions. Such characters may also be defined at a more inclusive level, 
examples being a gene sequence with many alleles, or even an entire 
mtDNA genome that may exhibit a whole collection of different haplotypes 
in a given population. Binary characters, by contrast, are those described in 
such a way that they can assume only two states, such as presence versus 
absence of an RFLP restriction site at a particular map location, or purine
versus pyrimidine at a given nucleotide position. 

BOX 3.3 Examples of Genetic Distance Statistics 





Allozymes 


For protein-electrophoretic data, the two distance measures that have been 
employed most commonly are Rogers's distance (Rogers 1972) and Nei's stan- 
dard genetic distance (Nei 1972, 1978). Several other such distance estimates for 
protein-electrophoretic data also have been proposed (see Nei 1987), but all tend 
to be highly correlated (although they may differ in absolute magnitude,
especially at larger values). 


Rogers's distance. For a given locus, let $x_i$ and $y_i$ be frequencies of the
ith allele in populations X and Y, respectively. Rogers's D is then defined as 

$$D = \left[ 0.5 \sum_i (x_i - y_i)^2 \right]^{0.5} \tag{3.1}$$

where the summation is over all alleles. When data from more than one gene are 
considered, the arithmetic mean of such values across loci becomes the overall 
genetic distance. Rogers's D can take values between zero and one. Rogers's 
genetic similarity (the converse of distance) is $S = 1 - D$. 


Nei's standard genetic distance. At any locus, Nei's genetic "identity"
(similarity) is defined as 

$$I = \frac{\sum_i x_i y_i}{\left( \sum_i x_i^2 \sum_i y_i^2 \right)^{0.5}} \tag{3.2}$$

and for multiple loci, the composite identity or similarity is 

$$I = \frac{J_{xy}}{(J_x J_y)^{0.5}} \tag{3.3}$$

where $J_x$, $J_y$, and $J_{xy}$ are arithmetic means across loci of
$\sum_i x_i^2$, $\sum_i y_i^2$, and $\sum_i x_i y_i$, 
respectively. Nei's I can assume values between zero and one. Standard genetic 
distance is calculated as 

$$D = -\ln I \tag{3.4}$$

Nei's D can range from zero to infinity, and its values are interpreted as mean 
numbers of codon substitutions per locus, corrected for multiple hits. 
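
As a worked illustration of equations 3.1 through 3.4, the following Python sketch computes both allozyme distances from allele frequencies. It is a minimal example under stated assumptions: the function names and the two-locus frequency data are invented, and each population is represented as a list of loci, each locus being a list of allele frequencies in the same allele order for both populations.

```python
import math


def rogers_distance(pop_x, pop_y):
    """Equation 3.1 per locus, then the arithmetic mean across loci."""
    per_locus = [
        math.sqrt(0.5 * sum((x - y) ** 2 for x, y in zip(fx, fy)))
        for fx, fy in zip(pop_x, pop_y)
    ]
    return sum(per_locus) / len(per_locus)


def nei_identity(pop_x, pop_y):
    """Equation 3.3: I = J_xy / (J_x * J_y)^0.5, with J's averaged over loci."""
    n_loci = len(pop_x)
    j_x = sum(sum(x * x for x in fx) for fx in pop_x) / n_loci
    j_y = sum(sum(y * y for y in fy) for fy in pop_y) / n_loci
    j_xy = sum(sum(x * y for x, y in zip(fx, fy))
               for fx, fy in zip(pop_x, pop_y)) / n_loci
    return j_xy / math.sqrt(j_x * j_y)


def nei_distance(pop_x, pop_y):
    """Equation 3.4: D = -ln I (undefined when no alleles are shared, I = 0)."""
    return -math.log(nei_identity(pop_x, pop_y))


# Invented data: locus 1 has identical frequencies; locus 2 is fixed for
# alternative alleles in the two populations.
pop_x = [[0.5, 0.5], [1.0, 0.0]]
pop_y = [[0.5, 0.5], [0.0, 1.0]]

rogers_d = rogers_distance(pop_x, pop_y)
nei_i = nei_identity(pop_x, pop_y)
nei_d = nei_distance(pop_x, pop_y)
print(rogers_d, nei_i, nei_d)
```

In this example Rogers's D is bounded (the fixed difference at one of two loci gives 0.5), whereas Nei's D, being a logarithmic transformation of identity, is not confined to the zero-to-one interval.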


DNA restriction fragments 

Upholt (1977) was the first to derive a relationship between the proportion of 
fragments shared in mtDNA digestion profiles and an estimate of nucleotide 
sequence divergence. Let $N_x$ and $N_y$ be the number of restriction fragments 
observed in sequences X and Y, and $N_{xy}$ be the number of fragments shared by 
X and Y. The overall proportion of shared fragments is calculated as 

$$F = \frac{2 N_{xy}}{N_x + N_y} \tag{3.5}$$

Then, an estimate of the number of base substitutions per nucleotide (or,
approximately, the percentage of nucleotides substituted) is given by 

$$p = 1 - \left\{ 0.5 \left[ -F + (F^2 + 8F)^{0.5} \right] \right\}^{1/r} \tag{3.6}$$

where r is the number of base pairs in the enzymes' recognition sites. Values of p 
are computed separately for enzymes recognizing four-, five-, and six-base 
sequences, and the final distance value is a weighted average of these estimates 
(weighted by total numbers of fragments produced by these respective enzyme 
classes). Using a different approach, Nei and Li (1979) derived a relationship 
between F and p essentially identical to that of Upholt. 


DNA restriction sites 


Let $N_x$ and $N_y$ be the number of restriction sites observed in sequences X
and Y, and $N_{xy}$ be the number of sites shared by X and Y. The proportion of
sites shared is 

$$S = \frac{2 N_{xy}}{N_x + N_y} \tag{3.7}$$

and the number of base substitutions per nucleotide is then estimated by 
either 

$$p = -\ln S / r \tag{3.8}$$

or 

$$p = -(3/2) \ln \left[ (4 S^{1/2r} - 1)/3 \right] \tag{3.9}$$

Equation 3.8 treats original restriction sites restored by back mutations as new 
sites, whereas equation 3.9 considers reverted sites as identical to the
originals. Values of p again must be calculated separately for enzymes cleaving at 
four-, five-, and six-base recognition sites and the overall distance computed 
as a weighted mean. 
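
The restriction-based estimators in equations 3.5 through 3.9 can be sketched in Python as follows. This is an illustrative example rather than a reference implementation: the function names and the example counts are invented, and the exponent in equation 3.9 is read as 1/(2r).

```python
import math


def shared_proportion(n_x, n_y, n_xy):
    """F (fragments, eq. 3.5) or S (sites, eq. 3.7): 2 * N_xy / (N_x + N_y)."""
    return 2 * n_xy / (n_x + n_y)


def p_from_fragments(f, r):
    """Equation 3.6: divergence from the proportion of shared fragments."""
    return 1 - (0.5 * (-f + math.sqrt(f * f + 8 * f))) ** (1 / r)


def p_from_sites(s, r, reverted_sites_identical=False):
    """Equation 3.8 by default, or equation 3.9 when reverted sites are
    counted as identical to the originals."""
    if reverted_sites_identical:
        return -1.5 * math.log((4 * s ** (1 / (2 * r)) - 1) / 3)
    return -math.log(s) / r


def weighted_mean(estimates, weights):
    """Combine per-enzyme-class estimates, weighting by fragment or site counts."""
    return sum(p * w for p, w in zip(estimates, weights)) / sum(weights)


# Invented example: six-base enzymes, 10 sites in each sequence, 8 shared.
s = shared_proportion(10, 10, 8)
p_38 = p_from_sites(s, r=6)
p_39 = p_from_sites(s, r=6, reverted_sites_identical=True)
p_frag = p_from_fragments(s, r=6)  # same proportion, treated as fragments
print(s, p_38, p_39, p_frag)

# Invented per-class estimates for four- and six-base enzymes.
overall = weighted_mean([0.02, 0.04], [10, 30])
print(overall)
```

When all fragments or sites are shared (F = S = 1), each estimator returns zero, which is a convenient sanity check on the formulas.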


Nucleotide sequences 

Only the simplest case will be considered, in which sequences of the same
length can be aligned without ambiguity. Let $z_d$ be the number of nucleotides
that differ between two sequences, and $z_t$ be the total number of nucleotides
compared. Percent sequence difference (expressed as a proportion) is then 

$$p = z_d / z_t \tag{3.10}$$

For sequences exhibiting little divergence, p is a close approximation to the 
accumulated number of nucleotide substitutions per site because multiple
substitutions at a site are then rare and little correction is needed. When
sequence divergences are larger, corrections for multiple hits are more
important. One simple correction was provided by Jukes and Cantor (1969): 

$$D = -0.75 \ln (1 - \tfrac{4}{3} p) \tag{3.11}$$

which assumes that substitutions at any nucleotide position occur with equal 
probability to any other nucleotide. Note that the maximum expected value of 
p under this model is 0.75 (the level expected between random sequences), and
that D increases without bound as p approaches this value. Other corrections
relax these assumptions. For example, the "two-parameter" model of Kimura
(1980) does so by accommodating different probabilities for transitions versus
transversions: 

$$D = 0.5 \ln [1/(1 - 2T - V)] + 0.25 \ln [1/(1 - 2V)] \tag{3.12}$$

where T and V are the observed proportions of transitions and transversions, 
respectively. Many additional considerations in deriving and interpreting
distance estimates for DNA sequences (and other molecular data) are detailed in 
Swofford et al. (1996). 
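
Equations 3.10 through 3.12 can be illustrated with a short Python sketch. The function names and the two ten-base sequences are invented for the example, and the code assumes aligned sequences of equal length containing only A, C, G, and T.

```python
import math

PURINES = {"A", "G"}


def count_differences(a, b):
    """Return (transitions, transversions, sites compared)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (x in PURINES) == (y in PURINES):
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions, len(a)


def p_distance(a, b):
    """Equation 3.10: proportion of sites that differ."""
    ts, tv, n = count_differences(a, b)
    return (ts + tv) / n


def jukes_cantor(p):
    """Equation 3.11; defined only for p < 0.75."""
    return -0.75 * math.log(1 - 4 * p / 3)


def kimura_2p(a, b):
    """Equation 3.12, from observed proportions T and V."""
    ts, tv, n = count_differences(a, b)
    t, v = ts / n, tv / n
    return 0.5 * math.log(1 / (1 - 2 * t - v)) + 0.25 * math.log(1 / (1 - 2 * v))


# Invented sequences: one transition (A/G) and one transversion (A/C).
seq_1 = "AAAAAAAAAA"
seq_2 = "GAAAAAAAAC"

p = p_distance(seq_1, seq_2)
d_jc = jukes_cantor(p)
d_k2p = kimura_2p(seq_1, seq_2)
print(p, d_jc, d_k2p)
```

Both corrected distances exceed the uncorrected p, as expected, because each attempts to account for substitutions hidden by multiple hits at the same site.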

Note that unlike allozyme genetic distances that are based on population 
allele frequencies and provide distances between populations or species, the 
sequence divergence estimates just described apply to separations between 
particular genes or alleles. If the sequences come from haploid individuals (as 
is effectively true for uniparentally inherited cytoplasmic genomes), calculated 
values can also be interpreted as between-individual distances (in that case, 
typically with respect to matrilines). When many such sequences within a
population are assayed, mean genetic distance (or nucleotide diversity; see Box 
2.1) is then estimated by 

$$\text{mean } p = \sum f_i f_j p_{ij} \tag{3.13}$$

where $f_i$ and $f_j$ are frequencies of the ith and jth sequences in the
population sample, and $p_{ij}$ is the estimated sequence divergence between the
ith and jth sequences (Nei 1987). Net sequence divergence ($p_{net}$) between
populations can then be estimated by correcting for within-population
polymorphism: 

$$p_{net} = p_{xy} - 0.5 (p_x + p_y) \tag{3.14}$$

where $p_{xy}$ is the mean pairwise genetic distance between individuals in
population X versus population Y, and $p_x$ and $p_y$ are mean distances among
individuals within these populations. This correction assumes that sequence 
diversity within extant populations is similar to the magnitude of sequence 
variation present in the ancestral population from which they diverged 
(which may not always be true). 
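
A brief Python sketch of equations 3.13 and 3.14 follows. The haplotype frequencies, distances, and function names are invented for illustration, and the summation in equation 3.13 is assumed here to run over all ordered pairs (i, j), with the distance of a sequence to itself equal to zero.

```python
import math


def mean_pairwise_distance(freqs, dist):
    """Equation 3.13, taken as a double sum over all ordered pairs (i, j)."""
    n = len(freqs)
    return sum(freqs[i] * freqs[j] * dist[i][j]
               for i in range(n) for j in range(n))


def net_divergence(p_xy, p_x, p_y):
    """Equation 3.14: between-population distance corrected for the mean
    within-population distance."""
    return p_xy - 0.5 * (p_x + p_y)


# Invented sample: three haplotypes in population X.
freqs_x = [0.5, 0.3, 0.2]
dist_x = [[0.00, 0.01, 0.02],
          [0.01, 0.00, 0.03],
          [0.02, 0.03, 0.00]]

p_x = mean_pairwise_distance(freqs_x, dist_x)
# Invented values for the second population and the between-population mean.
p_y = 0.02
p_xy = 0.05
p_net = net_divergence(p_xy, p_x, p_y)
print(p_x, p_net)
```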

Many computer programs are available to estimate genetic distances from 
sequence data, a fine example being MEGA (molecular evolutionary genetic 
analysis) software, recent versions of which include an expanded repertoire of 
distance estimation options that account for heterogeneity of substitution pat- 
terns when correcting for multiple hits (Kumar et al. 2000). 

Multi-state characters may be unordered or ordered. Electromorphs at 
an allozyme locus are examples of unordered character states because their 
evolutionary interrelationships cannot be deduced directly from their 
observed electrophoretic mobilities. Similarly, alternative states at a given 
nucleotide position normally are considered unordered because there is no 
a priori reason to assume a particular evolutionary pathway for intercon- 
version among A, T, C, and G (although transitions in many cases may be 
more frequent than transversions, for example). On the other hand, mtDNA 
haplotypes usually occur as ordered multi-state characters because their 
probable evolutionary transformations can be deduced by reasonable crite- 
ria such as parsimony (see Figures 3.9 and 3.10). However, phylogenies esti- 
mated from such composite characters are evolutionary inferences 
(hypotheses) ultimately derived from information accumulated across 
lower-level character-state descriptions (restriction sites or nucleotide posi- 
tions, in these cases). Thus, character-state matrices for most computer- 
based phylogenetic algorithms consist of binary or multi-state data coded at 
these more fundamental levels (see Boxes 3.1 and 3.2). 

Polarized characters are those for which ancestral and descendant states 
have been determined. Thus, polarity refers to the direction of character-state 
evolution and goes beyond the concept of character order, which can be 
described even for non-polarized states. Properly rooting a phylogenetic 
tree is important in helping to establish character-state polarities. 


Detached versus connectable information 


Some types of molecular data can be readily connected across studies, oth- 
ers less so, or not at all. DNA sequences are good examples of connectable 
data. Once nucleotide sequences are available for any gene or species, newly 
obtained sequences can be compared against the originals without need to 
repeat earlier assays. Other kinds of information are impossible to link 
directly across studies. For example, a ΔTm value (from DNA-DNA 
hybridization) between genomes A and B is of no immediate service in 
assessing their relationships to genomes C and D, for which another ΔTm 
value might be available. 

The distinction between connectable and detached data is not the same 
as that between discrete versus distance information. For example, protein 
electrophoresis provides qualitative genotypic data, but the electromorphs 
themselves are distinguished by gel mobilities relative to one another. Thus, 
it is difficult to compare particular electromorph genotypes reported in one 
study with those of another (unless shared standards have been employed 
in both). Another point is that data comparability does not necessarily imply 
that the phylogenetic analyses it permits will be easy. Serious computational
challenges arise in describing the vast combinatorial properties of connectable
data. DNA sequencing assays, for example, have become so prolific that
methods of data acquisition sometimes outstrip current systems of 
data management (Clegg and Zurawski 1992). This situation has given an 
urgency to the development of faster computer algorithms in comparative 
genomics and phylogenetics. 


Single-locus versus multi-locus data 


As normally applied, some molecular techniques (such as immunological 
methods and DNA sequencing) entail data acquisition from individual loci, 
whereas others (such as DNA-DNA hybridization and multi-locus finger- 
printing) inherently access genetic information from multiple independent 
gene regions. This distinction is important because the amount and nature 
of genetic information influences the interpretations that can be drawn from 
data. An inherent advantage of DNA-DNA hybridization is that the method 
encapsulates information from multitudinous sequences. Similarly, the 
power of multi-locus DNA fingerprinting in forensic applications stems in 
large part from independent assortment among the dispersed polymorphic 
arrays from which the method captures information. 

For most applications of genetic markers, the number of functional 
genes assayed is less important than the number of linkage groups repre- 
sented, which influences how many independent bits of phylogenetic infor- 
mation are revealed. For example, animal mtDNA is usually composed of 37 
functional genes, but all of these are transmitted as a non-recombining unit 
primarily through female lines. Thus, from a phylogenetic perspective, the 
entire 16-kb mtDNA molecule is a single locus. 

Multi-locus assays can be categorized further into those that assess 
information from multiple loci simultaneously (e.g., DNA-DNA hybridiza- 
tion or multi-locus DNA fingerprinting) versus sequentially (e.g., multi- 
locus protein electrophoresis or assays of microsatellites). Only the latter 
type of assay normally provides information that is interpretable in simple 
genetic terms; that is, as Mendelian genotypes at particular loci. Although 
the number of genes included in an allozyme or microsatellite survey is typ- 
ically small or moderate, even a handful of interpretable genetic polymor- 
phisms, considered in aggregate, provide remarkable power in applications 
such as forensics, parentage assessment, gene flow estimation, and charac- 
terization of hybrids. 


Utility of data along the phylogenetic hierarchy 


Another way to slice the molecular techniques pie is with regard to the level 
of evolutionary separation at which the various methods best apply (Box 
3.4). Most assays provide an empirical window of opportunity that is fairly 
narrow relative to the broad field of potential phylogenetic applications. For 
example, protein immunological and DNA-DNA hybridization methods 
typically are suitable for phylogenetic studies at intermediate taxonomic 
levels, where species' separations may date to approximately 2-100 million 
years ago (mya). At the microevolutionary end of the continuum, DNA
fingerprinting methods using mini- or microsatellites are powerful for
individual forensics and parentage analysis. Studies of mtDNA (RFLPs or 
sequencing) have been highly fruitful at the levels of conspecific
populations and closely related species, as have allozyme surveys. Among the 
available molecular methods, only DNA sequencing can find application at 
virtually any taxonomic level. This flexibility stems from the fact that
different DNA sequences evolve at highly different rates, such that the choice 
of sequence to be examined can be tailored to each research question. 
Nonetheless, because of the labor and expense involved, obtaining DNA 
sequences from large numbers of individuals and large numbers of genes in 
a population context is not particularly cost-effective, and sequencing
studies usually are conducted at intermediate or higher levels of the
phylogenetic hierarchy. 


BOX 3.4 Representative Molecular Approaches and 
Phylogenetic Resolution 

The chart below indicates various levels of the evolutionary hierarchy at which a 
given molecular approach normally provides optimal phylogenetic resolution. 
More asterisks indicate higher suitability; dashes indicate that the technique is not 
particularly useful at that hierarchical level. However, few of these
characterizations are absolute. For example, some protein-electrophoretic
characters, such as presence versus absence of duplicate gene products, can be
phylogenetically informative about deeper nodes in a phylogenetic tree, as noted
in the text. 


Of course, the volume of genetic data obtained also influences the
resolution obtainable in a given application. For example, restriction site or 
sequencing studies of animal mtDNA at the population level commonly 
involved assays of about 500 bp per individual. At the conventional
mammalian sequence divergence rate of 2% per million years, roughly one in 500 
bp is expected to have changed after 100,000 years of matrilineal separation, 
thus establishing 100 millennia as an approximate lower limit of resolving 
power for de novo mutations with this level of effort. If the full 16,000-bp 
mtDNA genome were assayed in each individual, the ability to detect de 
novo sequence divergence would increase by more than 30-fold, pushing a 
lower limit on resolving power to just a few thousand years. These
statements refer strictly to the accumulation of novel mutations, and not to
significant shifts in allele frequencies from preexisting ancestral
polymorphisms (such shifts could take place in as little as one generation by
genetic drift, natural selection, or migration). 
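
The arithmetic behind these resolution limits can be checked with a short calculation. The Python sketch below uses only the figures quoted in the text (500 bp or 16,000 bp assayed, and a divergence rate of 2% per million years); the function names are hypothetical.

```python
import math


def expected_substitutions(n_bp, divergence_pct_per_my, years):
    """Expected number of differences between two matrilines after `years`
    of separation, for n_bp assayed per individual."""
    return n_bp * (divergence_pct_per_my / 100) * (years / 1e6)


def years_per_substitution(n_bp, divergence_pct_per_my):
    """Separation time at which one difference is expected in n_bp."""
    return 1e6 / (n_bp * divergence_pct_per_my / 100)


partial = years_per_substitution(500, 2.0)      # about 500 bp assayed
full = years_per_substitution(16_000, 2.0)      # full mtDNA genome
print(partial, full, partial / full)
```

With 500 bp the expected waiting time for a single difference is 100,000 years; with the full 16,000-bp genome it falls to roughly 3,000 years, a 32-fold gain consistent with the "more than 30-fold" figure in the text.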


SUMMARY 


1. Many laboratory methods exist for revealing genetic markers. In rough order 
of historical appearance, the most important of these have been immunologi- 
cal methods (especially micro-complement fixation), multi-locus starch-gel 
electrophoresis (SGE) of proteins, and a succession of DNA-level approaches 
ranging from DNA-DNA hybridization to restriction assays to minisatellite 
DNA fingerprinting. More recently, numerous PCR-based approaches have 
been developed to reveal AFLPs, RAPDs, SSCPs, SINEs, SNPs, STRs 
(microsatellites), and other polymorphic features of the genome. Direct DNA 
sequencing techniques have been available since the mid-1970s, but as they 
have become increasingly automated and coupled with PCR, they have vastly 
increased the ease with which great volumes of information can be obtained 
directly at the nucleotide level. 


2. The wide variety of laboratory methods available can be categorized for 
heuristic purposes into several different groupings according to the general 
nature of genealogical information each technique provides: protein-level ver- 
sus DNA-level data; discrete character states versus genetic distances only; 
information that is detached as opposed to readily connectable between sepa- 
rate studies; and single-locus versus multi-locus data. 


3. No single molecular technique is ideally suited to all research endeavors. 
Rather, each assay method is fruitfully applied to specific tasks somewhere 
along a genealogical hierarchy. At the microevolutionary end of this hierarchy 
are analyses of clonality, genetic parentage, and close kinship. Then come 
assessments of extended intraspecific genealogy and phylogeography, specia- 
tion and hybridization, and finally, at the macroevolutionary end of the scale, 
phylogeny estimation at intermediate and deeper branches in the Tree of Life. 








4 


Philosophies and Methods 
of Molecular Data Analysis 





Thus the hereditary properties of any given organism could be characterized 
by a long number written in a four-digital system. 
G. Gamow (1954) 


Molecular markers lend themselves to a wide variety of data analysis methods, 
depending on the particular research problem being addressed. Many specific 
approaches and their applications (e.g., in gene flow estimation, genetic parentage 
analyses, cytonuclear dissections of hybrid zones) are described where relevant in 
later chapters, but one rather generic class of applications and its historical backdrop 
merits introduction here: phylogenetic analysis. This chapter presents thumbnail 
sketches of the history and some underlying principles of phylogenetic reconstruc- 
tion, brief descriptions of the primary techniques employed, and some key refer- 
ences to a vast literature on this topic. For a comprehensive and advanced overview 
of phylogenetic methods, I recommend Swofford et al. (1996), and for a "how-to" 
treatment with empirical examples and computer exercises, see Hall (2004). 


Cladistics versus Phenetics 


Until the mid-1900s, following the tradition of Linnaeus (1759), organisms were 
classified mostly by qualitative gestalt appraisals of morphology. Specialists typi- 
cally devoted years of study to a particular group, such as birds or beetles, and 
based on accumulated experience, classified their creatures into a hierarchical tax- 
onomy. This approach enabled scientists to organize and catalogue the otherwise 
bewildering diversity of the natural world, and the endeavor resulted in most of the 
biological classifications still followed today. Shortcomings of the approach 
stemmed from the lack of unifying or standardized classification methods 
(either conceptual or operational), often with the following consequences: a 
centering of systematic authority within a small number of research experts 
for each taxonomic group; the lack of formalized procedures for corroboration 
or refutation of a proposed classification; the absence of a uniform measure by 
which classifications for different taxonomic groups might meaningfully be 
compared; and the lack of a clear philosophy on precisely what aspects of evo- 
lution a particular classification reflected. 

In the 1960s, explicit concern with these shortcomings of traditional
practice prompted the rise of numerical taxonomy, or the phenetic approach to
systematics (Sokal and Sneath 1963). Pheneticists proposed that organisms 
should be grouped and classified according to overall similarity (or its
converse, distance), as measured by defined rules and using as many organismal 
traits as possible. Among the principles guiding numerical taxonomy were the 
following (Sneath and Sokal 1973): the best classifications result from analyses 
based on many organismal features or characters; at least at the outset, every 
character is afforded equal weight; classifications should be based on
quantitative measures of overall (phenetic) similarity or distance between taxa;
and patterns of character correlation can be used to recognize distinct taxa and 
draw systematic inferences. Philosophically, numerical taxonomists aimed to 
develop methods that were "objective, explicit, and repeatable" (Sneath and 
Sokal 1973). Operationally, its practitioners quantitatively estimated phenetic 
similarity, examined character correlations, and grouped taxa accordingly. 

Numerical taxonomy provided a valuable service to science by critically 
scrutinizing traditional systematic practices that had been needlessly 
opaque. Nonetheless, pheneticists were criticized on several fronts (Hull 
1988), most notably by cladists, who proposed an alternative philosophy and 
protocol for phylogenetic reconstruction and classification (Eldredge and 
Cracraft 1980). Under the tenets of the cladistic school, phylogeny should be 
appraised not by overall similarity between organisms, but rather by a
subset of similarity attributable to synapomorphic or shared-derived traits (Box 
4.1; Figure 4.1). Cladists typically focused almost exclusively on the branch- 


BOX 4.1 Terminology and Concepts Relevant to 
Cladistic-Phenetic Discussions 


I. Classes of organismal resemblance 


Phenetic similarity: The overall resemblance between organisms. 

Patristic similarity: The component of overall similarity that is due to 
shared ancestry. 

Homoplastic similarity (homoplasy): The component of overall similarity that 
is due to convergence from unrelated ancestors. (The term homoplasy is also 
frequently used to describe "extra steps" implied in a phylogenetic network 
beyond those that distinguish taxa in a raw data matrix. In this latter usage, 
homoplasy may arise from convergence, parallelism, or evolutionary
reversals in character states.) 


II. Classes of character states used to characterize 
organismal resemblance 


Plesiomorphy: An ancestral character state (i.e., one present in the common 
ancestor of the taxa under study). 

Symplesiomorphy: An ancestral character state shared by two or more 
descendant taxa. 

Apomorphy: A derived or newly evolved character state (i.e., one not present 
in the common ancestor of the taxa under study). 

Synapomorphy: A derived character state shared by two or more descendant 
taxa. 

Autapomorphy: A derived character state unique to a single taxon. 

III. Other relevant definitions


Monophyletic group or clade: An evolutionary assemblage that includes a 
common ancestor and all of its descendants. 

Paraphyletic group: An artificial assemblage that includes a common ancestor 
and some, but not all, of its descendants. 


Polyphyletic group: An artificial assemblage derived from two or more distinct
ancestors.


Outgroup: A taxon phylogenetically outside the clade of interest.
Sister taxa: Taxa stemming from the same node in a phylogeny. 


Phenetic resemblance may be due to patristic and/or homoplastic similarity. 
Patristic similarity may arise from symplesiomorphic and/or synapomorphic 
character states. Cladists attempt to distinguish between symplesiomorphic and 
synapomorphic similarity and to identify clades on the basis of synapomorphies 
only (see Figure 4.1). Pheneticists usually make no such attempts to distinguish 
sources of resemblance. Phylogenetic reconstructions based on either cladistic or 
phenetic principles can be compromised by extensive homoplasy. 

Because cladists try to distinguish symplesiomorphies from synapomor- 
phies, much effort is devoted to the elucidation of evolutionary “polarities” 
(derived versus ancestral conditions) of character states. The following are some 
criteria that have been suggested to indicate primitiveness for a character: 


1. Presence in fossils
2. Commonness among an array of taxa
3. Early appearance in ontogeny
4. Presence in an outgroup


The fourth criterion is most widely employed today, as the others have proved 
misleading or incorrect in many instances (Stevens 1980). 





Figure 4.1 Philosophical rationale underlying Hennigian cladistic attempts to
distinguish sources of similarity. Shown is the true (but unknown to the
researcher) phylogeny for taxa A-F and the distribution of observed binary states
for characters X, Y, and Z. Suppose that taxon A is known to be an outgroup for
B-F, whose phylogeny is to be reconstructed. Character states x0, y0, and z0,
possessed by various taxa and the outgroup, are symplesiomorphies (shared ancestral
states), and hence identify no clades. In particular, y0 and z0 could be positively
misleading in amalgamating ingroup members (C-F and B, C, F, respectively) had
their ancestral status gone unrecognized. Character state y1 defines no multi-taxon
clade because it is an autapomorphy, and z1 could be misleading as a putative
clade marker because it evolved in parallel (convergent) fashion in taxa D and E.
Only x1 is a valid synapomorphy in this example, correctly identifying the true
clade composed of taxa B, C, and D.

















ing component (cladogenetic aspect) of evolutionary trees, rather than on 
branch lengths (accumulated changes within lineages through time, or ana- 
genesis). Their ultimate goal was to develop organismal classifications based 
on correctly inferred cladogenesis. 

As sometimes practiced, cladistic approaches themselves were not 
entirely immune from criticism. For example, one widely held belief was that 
“one true synapomorphy is enough to define a unique genealogical relation- 
ship” (Wiley 1981). This sentiment is incorrect, however, if “genealogical 
relationship” is taken to imply organismal relationship (as it often was), 
because it fails to recognize the fundamental distinction between a gene tree 
(i.e., a character phylogeny) and an organismal tree (see below). An unfortu- 
nate consequence was that some cladists occasionally remained dogged in 
advocating putative organismal clades that had received support from only 
one or a few presumptive morphological synapomorphies, even if these con- 
flicted with volumes of other information (e.g., from molecular sequences). 
Thus, unless many independent characters were assayed (as advocated by 
pheneticists), there was a potential danger in cladistics of the kinds of author- 
itarianism that had plagued systematics earlier in the century and had 
prompted the original rise of numerical taxonomy. 
Indeed, there was little justification for the rancor of cladistic attacks on 
phenetics, for at least two reasons. First, cladistics owed a deep debt to 
numerical taxonomy for opening new discussions and viewpoints on tradi- 
tional systematic practices. Second, cladistic methods as applied to large 
data sets often came rather close to numerical taxonomy's procedures 
(because for such data, character conflicts in clade delineation almost 
inevitably arise, thereby requiring some form of numerical tallying of puta- 
tive synapomorphies). Thus, numerical cladistics and numerical phenetics 
were not as distinct operationally as they at first appeared. The important 
element that cladistics added to systematics was its explicit attempt to dis- 
tinguish between alternative sources of evolutionary similarity and thereby 
account for specified character-state distributions in terms of phylogenetic 
history. 

The original "bible" of the cladistic school, written by the German
entomologist Willi Hennig and published in 1950, appeared in an English version
entitled Phylogenetic Systematics in 1966. In the ensuing decades, cladistic 
methods based on Hennig's insights revolutionized systematic practice as 
applied to traditional taxonomic characters. Cladograms were generated for 
many taxonomic groups, and these typically included explicit descriptions 
of character polarities (ancestral versus derived conditions), character trans- 
formations, and temporal orders of appearance of various morphological, 
physiological, or behavioral character states. Thus, a major advantage of 
cladistic approaches was that they often resulted in testable hypotheses 
regarding evolutionary origins of and conversions among particular char- 
acter states, including molecular ones (Buth 1984; Patton and Avise 1983). 

The philosophical pillar of the cladistic school—that shared-derived 
traits are the appropriate basis for clade delineation—is now almost univer- 
sally accepted. Most questions center instead on operational issues: How 
reliably can synapomorphies be identified? What kinds of characters are 
best used? How are character conflicts resolved when putative clades iden- 
tified by different presumptive synapomorphies disagree? How is a phy- 
logeny to be translated into a classification? All such questions apply to 
molecular markers as well as to phenotypic traits. 

In a historical sense, the cladistic-phenetic debate that began in the 
1960s had at least two important ramifications for molecular evolution, 
which also was beginning its rise then. First, the debate gave renewed ener- 
gy to morphology-based systematics at a time when some traditionally 
trained systematists felt threatened by the emergence of molecular biology. 
One unfortunate consequence of this timing is that molecular and morpho- 
logical approaches to systematics often were viewed as being in opposition, 
a perception with no valid basis. Second, the cladistic-phenetic war, 
although waged primarily in the context of morphology-based systematics, 
occasionally spilled over such that molecular phylogenetics was caught in 
the crossfire. For example, strict Hennigian approaches cannot be applied 
to raw data consisting solely of numerical distance values between taxa, 
and as a result some cladists automatically discredited all such information 
derived from the important immunological and nucleic acid hybridization 
methods of molecular biology. Attacks also were mounted against the 
widespread practice of summarizing molecular data using some of the 
algorithms of numerical taxonomy, despite the fact that an important 
assumption of phenetic procedures (a constant evolutionary rate across 
phenogram branches) appeared to mesh well with growing independent 
evidence of clocklike behavior for many biological macromolecules. 
Furthermore, many molecular data, even those in the form of qualitative 
character states (such as allozyme alleles, restriction sites, or SNPs) are not 
particularly well suited for strict Hennigian cladistic analysis because of a 
high risk of homoplasy at individual electromorphs or nucleotide character 
states (Straney 1981). 

Apparent conflicts among characters in clade delineation arise prima- 
rily from homoplasy (see Box 4.1). To minimize the number of ad hoc 
hypotheses required to resolve character conflicts along a phylogeny, prin- 
ciples of “maximum parsimony” were soon developed as an extension of 
cladistic principles (Felsenstein 1983; Sober 1983). As applied to phyloge- 
netic inference, parsimony algorithms operate by estimating evolutionary 
trees of minimum total length (i.e., trees that minimize the number of evo- 
lutionary transformations among character states required to explain a 
given data set). Although general notions of parsimony (as in Occam's 
razor) have long been a key part of much biological reasoning, the emer- 
gence of cladistic philosophy provided an important historical step in the 
further elaboration of parsimony and other approaches in molecular phy- 
logenetic reconstruction. 
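
As a rough illustration of what "minimum total length" means, the sketch below counts the fewest state changes each character requires on a fixed tree, using Fitch's set-intersection procedure (one common way to do the counting; the text does not name a specific algorithm). The two candidate trees, the 0/1 coding, and the function names are all hypothetical, and the character matrix is the one described in the Figure 4.1 caption.

```python
# Binary characters X, Y, Z for taxa A-F (assumed 0/1 coding, A = outgroup).
MATRIX = {
    "A": {"X": 0, "Y": 0, "Z": 0},
    "B": {"X": 1, "Y": 1, "Z": 0},
    "C": {"X": 1, "Y": 0, "Z": 0},
    "D": {"X": 1, "Y": 0, "Z": 1},
    "E": {"X": 0, "Y": 0, "Z": 1},
    "F": {"X": 0, "Y": 0, "Z": 0},
}

# Two hypothetical rooted trees written as nested pairs.
TREE_BCD = ("A", ((("B", "C"), "D"), ("E", "F")))   # groups B, C, D (supported by X)
TREE_DE = ("A", ((("B", "C"), ("D", "E")), "F"))    # groups D, E (supported by Z)


def fitch_length(tree, states):
    """Return (candidate ancestral states, minimum number of changes) for one character."""
    if isinstance(tree, str):  # a leaf: its observed state, no changes yet
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_length(left, states)
    right_set, right_cost = fitch_length(right, states)
    shared = left_set & right_set
    if shared:  # descendants can agree, so no change is needed at this node
        return shared, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1  # one extra step


def tree_length(tree, matrix):
    """Sum the minimum changes over all characters in the matrix."""
    characters = next(iter(matrix.values())).keys()
    return sum(
        fitch_length(tree, {taxon: row[c] for taxon, row in matrix.items()})[1]
        for c in characters
    )


print(tree_length(TREE_BCD, MATRIX), tree_length(TREE_DE, MATRIX))
```

With only these three characters, both hypothetical trees require four steps: character Z needs an extra step on the first tree and character X needs one on the second. Such ties are one reason that parsimony analyses depend on tallying many independent characters.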


Molecular Clocks 


Zuckerkandl and Pauling (1965) were the first to propose that particular 
proteins and DNA sequences might evolve at roughly constant rates over 
time, and might thereby provide internal biological timepieces for dating 
past evolutionary events. The concept of molecular clocks fits well with neu- 
trality theory because, as discussed in Chapter 2, the rate of neutral evolu- 
tion in genetic sequences is, in principle, equal to the mutation rate to neu- 
tral alleles. However, clock concepts need not be incompatible with selection 
scenarios: If large numbers of assayed genes are acted upon by multifarious 
selection processes over long periods of time, short-term fluctuations in 
selection intensities could tend to average out, such that overall magnitudes 
of genetic distance between taxa might well correlate strongly with times 
elapsed since common ancestry. 

Few concepts in molecular evolution have been more contentious (or 
abused) than molecular clocks. At the outset, several general points must be 
understood. First, the debate is not whether molecular clocks behave metro- 
nomically, like a working timepiece—they do not. If molecular clocks exist, 
both neutralists and selectionists predict at best a “stochastically constant” 
behavior, somewhat like radioactive decay (Ayala 1982c; Fitch 1976). 
Second, not all genealogical applications for genetic markers hinge critical- 
ly on the reliability of molecular clocks. For example, genetic characters like- 
ly to be of monophyletic origin (such as specific gene duplications, changes 
in gene arrangement, or SINE insertions) provide powerful phylogenetic 
markers regardless of rate of evolution at the DNA sequence level; and 
appraisals of genetic identity, parentage, and kinship hardly depend at all 
on a steady pace of DNA sequence evolution. 

Third, most tree-building algorithms (see next section) relax assump- 
tions of rate homogeneity among genes and lineages. In principle and prac- 
tice, branching orders in phylogenies can be inferred directly from distribu- 
tions of qualitative character states using techniques such as cladistic, parsi- 
mony, or maximum likelihood analyses, which remain valid irrespective of 
whether molecules evolve in strictly time-dependent fashion. Even when 
genetic distance matrices are the starting point for phylogenetic reconstruc- 
tions, most tree-building methods can accommodate heterogeneity in 
molecular evolutionary rates at least to some extent. 

A fourth basic point about molecular clocks is that different DNA 
sequences empirically do evolve at markedly different rates (see review in 
Graur and Li 2000). Rate heterogeneity is apparent at many levels: across 
nucleotide positions within a codon (where, for example, mean rates for 
synonymous substitutions normally are several times higher than for non- 
synonymous substitutions involving changes in protein-coding regions; 
Figure 4.2A,B); among non-homologous genes within a lineage (in which 
non-synonymous rates can vary by orders of magnitude, as between the 
slowly evolving histones and rapidly evolving relaxins; Figure 4.2A); 
among classes of DNA within a genome (e.g., introns and pseudogenes 
evolve more rapidly than do non-degenerate sites in protein-coding genes; 
Figure 4.2C); and among different genomes within an organismal lineage 
(for example, synonymous substitution rates in cpDNA are severalfold 
lower than mean rates in plant nuclear genomes, and mtDNA in many ver- 
tebrate animals evolves about 5-10 times faster than typical single-copy 
nuclear DNA). 

Under neutrality theory (Kimura 1983), such rate heterogeneity across 
nucleotide sites, genes, and genomes within a phylogenetic lineage is inter- 
preted to reflect varying intensities of purifying selection associated with dif- 
fering levels of functional constraint on DNA sequences, perhaps in conjunc- 
tion with variation in the underlying rate of mutation to strictly neutral alle- 
les (Britten 1986; but see Kumar and Subramanian 2002). Ironically, extreme 
rate heterogeneity can be highly beneficial for phylogenetic studies by per- 
mitting choice of appropriate DNA regions geared to the time frame of a par- 
ticular phylogenetic problem (see Box 3.4). For example, the slowly evolving 
sequences of ribosomal RNA genes have been extremely informative in 
Figure 4.2 Estimated rates of nucleotide substitution in various genes and gene 
regions. (A) Non-synonymous and (B) synonymous substitution rates in eight pro- 
tein-coding genes sequenced from humans and rodents. Calculations are based on 
the assumption that observed sequence differences accumulated over 80 million 
years. (C) Average substitution rates in different parts of these and other genes. 
Non-degenerate nucleotide sites are those at which all possible substitutions are 
non-synonymous. At fourfold degenerate sites, all possible substitutions are synonymous;
at twofold degenerate sites, one of three possible nucleotide changes is 
synonymous and the other two are non-synonymous. (Data from Li and Graur 
1991; see also Graur and Li 2000; Li 1997.) 


reconstructing deep branches in the Tree of Life (see Chapter 8), whereas rap- 
idly evolving mtDNA sequences have revolutionized phylogeographic stud- 
ies of animals at the intraspecific level (see Chapter 6). Within a given gene 
or molecule, different classes of sites are also informative over different time 
frames. In animal mtDNA, for example, slowly accumulating replacement 
substitutions and transversions are often most useful for addressing phy- 
logeny at the level of species, genera, or taxonomic families, whereas rapid- 
ly accumulating silent substitutions and transitions provide many more 
markers for analyzing local populations within a species. 
History of clock calibrations and controversies 


For purposes of phylogenetic reconstruction, an even more controversial 
and problematic form of rate heterogeneity involves differences in the evo- 
lutionary tempo of homologous DNA sequences across organismal lineag- 
es. For securely dating past separation events, approximate uniformity in 
evolutionary rates would clearly be desirable (A. C. Wilson et al. 1987). 
Indeed, standard rate calibrations have been suggested for a variety of 
genes and assay methods (Figure 4.3), but in general, these have also been 
criticized as being far less than universally applicable. 


ANIMAL MITOCHONDRIAL DNA AS A PROTOTYPE. A conventional calibration 
for the evolutionary rate of animal mtDNA (derived from the initial slope of 
the linear portion of the divergence curve; Figure 4.3A) is about 2% sequence 
divergence per million years (or 2 × 10^-8 substitutions per site per year) 
between pairs of mammalian lineages that have been separated for less than 
10 million years (Brown et al. 1979). In referring to mtDNA in higher animal 
taxa, Wilson et al. (1985; see also Shields and Wilson 1987a) concluded that 
"no major departures from this rate are known for the molecule as a whole." 
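
The arithmetic behind this conventional calibration can be sketched as follows. The function names and the 10-million-year cutoff parameter are illustrative assumptions; the only values taken from the text are the rate of 2% pairwise sequence divergence per million years and its restriction to lineages separated for less than 10 million years.

```python
# Conventional animal mtDNA calibration quoted in the text:
# 2% pairwise sequence divergence per million years (Myr).
PAIRWISE_RATE_PER_MYR = 0.02


def rate_per_site_per_year(rate_per_myr=PAIRWISE_RATE_PER_MYR):
    """Convert a per-Myr pairwise rate into substitutions per site per year."""
    return rate_per_myr / 1_000_000


def divergence_time_myr(pairwise_divergence, rate_per_myr=PAIRWISE_RATE_PER_MYR,
                        max_linear_myr=10.0):
    """Estimate separation time (Myr) from pairwise divergence (as a proportion).

    Raises ValueError when the estimate falls outside the range over which the
    calibration is described as linear (an assumed cutoff of 10 Myr).
    """
    time_myr = pairwise_divergence / rate_per_myr
    if time_myr > max_linear_myr:
        raise ValueError("estimate lies beyond the linear portion of the curve")
    return time_myr


print(rate_per_site_per_year())      # about 2e-08
print(divergence_time_myr(0.05))     # 5% divergence -> about 2.5 Myr
```

Such a point estimate carries no confidence limits and inherits every uncertainty in the calibration itself.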

However, this conclusion became controversial for the following empirical
reasons. First, ratios of mean mtDNA/scnDNA divergence rates soon 
were shown to differ significantly between animal taxa (Caccone et al. 
1988a; DeSalle et al. 1987; Powell et al. 1986; Vawter and Brown 1986), 
although whether this was due to rate variation in mtDNA, scnDNA, or 
both was unclear. Second, different nucleotide positions and genes within 
mtDNA were shown to evolve at varying rates within a lineage (Brown et 
al. 1982; Gillespie 1986; Moritz et al. 1987), and particular mtDNA genes 
(such as cytochrome oxidase) reportedly showed rate differences as high as 
fivefold across taxa (Brown and Simpson 1982; Crozier et al. 1989). Third, 
many researchers began to report significant variation in mean mtDNA evo- 
lutionary rates among a variety of organismal lineages, including different 
mammalian orders (Hasegawa and Kishino 1989), vertebrate classes (Avise 
et al. 1992a; Bowen et al. 1993a), homeothermic versus heterothermic verte- 
brates (Kocher et al. 1989), major fish groups (Krieger and Fuerst 2002), dif- 
ferent clades of Hawaiian Drosophila (DeSalle and Templeton 1988), and var- 
ious invertebrate taxa such as scleractinian corals vis-à-vis the vertebrate 
norm (Romano and Palumbi 1996; van Oppen et al. 1999). 

In attempts to understand such apparent differences among animal 
groups, some authors noted that mtDNA evolutionary rates seemed to be 
positively correlated with organisms' basal metabolic rates and negatively 
correlated with generation lengths or body sizes (A. P. Martin et al. 1992; Rand 
1994; Thomas and Beckenbach 1989). A related hypothesis involved the con- 
cept of nucleotide generation time: the mean absolute time elapsed between 
successive episodes of DNA replication or repair (Martin and Palumbi 1993). 
The idea was that nucleotide positions with higher turnover (shorter replica- 
tion intervals) might be subject to more mutational opportunities per unit of 
Figure 4.3 Early examples of clock calibrations reported for various types of 
molecular genetic data. All dates along the abscissa came from fossil or biogeo- 
graphic evidence. (A) Dark line: mtDNA sequence divergence for various mam- 
mals (after Brown 1983). The slope of the linear portion of this curve gives the con- 
ventional mtDNA clock calibration of 2% sequence divergence per million years 
between recently separated lineages. Beyond about 15-20 million years, mtDNA 
sequence divergence begins to plateau, presumably as the genome becomes satu- 
rated with substitutions at the variable sites. Light gray line: percentage (right- 
hand axis) of observed mtDNA transitions for various mammals (open squares; 
after Moritz et al. 1987) and Drosophila (solid squares; after DeSalle et al. 1987). 

(B) Albumin immunological distances (as estimated by micro-complement fixa- 
tion) for various carnivorous mammals and ungulates. (After Wilson et al. 1977.) 
(C) Accumulated codon substitutions in seven proteins (cytochrome c, myoglobin, 
sidereal time, such that organisms with higher metabolic rates and briefer life 
spans might tend to have shorter nucleotide generation lengths and higher 
absolute rates of molecular evolution. Additional possibilities (not mutually 
exclusive) are that species might differ in their inherent fidelity of mtDNA 
replication, that DNA repair mechanisms (which are known to be deficient in 
mitochondria) might vary across taxa, that differences in mtDNA base com- 
position and codon usage across taxa contribute to variation in evolutionary 
rates, and that mtDNA in different organisms might be exposed to different 
concentrations of mutagens (such as oxygen radicals generated within mito- 
chondria). These and other possibilities and their effects on phylogenetic esti- 
mation are reviewed by Mindell and Thacker (1996) and Arbogast et al. 
(2002). 

One difficulty in wholeheartedly accepting any such hypothesis by 
itself is that at least a few "troubling" exceptions seem to crop up in all sus- 
pected trends. For example, Stanhope et al. (1998a) reported that mitochon- 
drial genes in some long-generation mammals, such as elephants and 
humans, evolve faster than their homologues in short-generation rodents. 
Likewise, any correlation between metabolic and evolutionary rates seems 
to be less than universal when mtDNA sequences from widely diverse taxa 
are considered (e.g., Seddon et al. 1998). The bottom line is that inter-taxon 
differences in the pace of mtDNA nucleotide evolution remain poorly 
understood mechanistically, and multiple interacting factors probably con- 
tribute to the outcomes. 


OTHER GENES AND GENOMES. Examples of reported calibrations for several 
other putative molecular clocks are summarized in Figure 4.3. Again, con- 
troversies soon arose over the validity and universality of such results, par- 
ticularly as applied across taxonomic groups. The inherent appeal of molec- 
ular clocks occasionally led to some egregious claims. For example, the early 
protein electrophoretic literature conveyed a strong impression that 
allozyme distances reliably date speciation events. Many papers concluded 


α- and β-hemoglobin, fibrinopeptides A and B, and insulin) for various mammalian
species. The three open squares involve primate comparisons. (After 
Langley and Fitch 1974; Nei 1975.) (D) Codon substitutions per locus (Nei's D) 
based on allozyme comparisons of carnivores (solid circles) and primates (open 
squares). (After Wayne et al. 1991a.) (E) ΔTm values from DNA-DNA hybridization 
involving carnivores (closed circles) and primates (open squares). (After Wayne et 
al. 1991a.) (F) Percent sequence divergence in 16S ribosomal DNA for various 
eubacterial forms: (a) cyanobacteria; (b) chloroplasts; (c) microaerophiles; (d) mito- 
chondria; (e) obligate aerobes; (f) Photobacterium; (g) Rhizobium and Bradyrhizobium; 
and (h) Escherichia. Wide horizontal bars indicate high uncertainty about diver- 
gence times from nonmolecular evidence (the slope of the line is arbitrarily drawn 
to represent an evolutionary rate of 1% sequence divergence per 50 million years). 
(After Ochman and Wilson 1987.) 
that observed genetic distances were consistent with suspected separation 
times for particular species as gauged by nonmolecular evidence (such as 
fossil ages or the presence of geographic barriers). It turns out, however, that 
different authors had employed (perhaps unwittingly) allozyme rate cali- 
brations that differed from one another by more than 20-fold (see review in 
Avise and Aquadro 1982). Given such a huge range of potential clock cali- 
brations in the literature, it is difficult to imagine any observed allozyme dis- 
tance that could not have been accommodated with a given fossil-based or 
biogeography-based scenario. Ironically, if these researchers individually 
were correct that a molecular clock had been ticking within each taxonomic 
group, then collectively no single allozyme clock could apply across these 
same taxa. Ayala (1986) provided another early summary of evidence for the 
erratic behavior of particular protein clocks across lineages, as did Britten 
(1986) and Brunk and Olson (1990) for overall rates of scnDNA sequence 
divergence as gauged by DNA-DNA hybridization. 

Some investigators concluded that although mean molecular rates in 
nuclear genomes vary among taxa, they do so in a predictable or at least con- 
sistent fashion. For example, molecular evolution in nDNA was argued to be 
slower in primates than in rodents (in contrast to mtDNA) and was thought 
to be especially slow in hominoids (Goodman et al. 1971; Koop et al. 1989; Li 
and Tanimura 1987; Li et al. 1987; Maeda et al. 1988; but see also Caccone and 
Powell 1989; Easteal 1991; Kawamura et al. 1991; Sibley and Ahlquist 1987). 
Several published interpretations were similar to those presented above for 
mtDNA. For example, Catzeflis et al. (1987) and Sibley et al. (1988) present- 
ed evidence that short generation times or increased numbers of germ line 
cell divisions were associated with higher mean rates of scnDNA sequence 
evolution in birds and mammals [thereby contradicting Sibley and Ahlquist's 
(1984) prior advocacy of a “uniform rate of DNA evolution”]. 

Other researchers provisionally attributed apparent variation in 
nuclear substitution rates among taxa to differences in generation length 
(Gaut et al. 1992; Kohne 1970; Laird et al. 1969; Li et al. 1987), numbers of 
DNA replications in germ line cells (Hurst and Ellegren 1998; Wu and Li 
1985), repair efficiencies during DNA replication (Britten 1986), or magni- 
tudes of exposure to mutagenic agents including free radicals, whose level 
of DNA damage appears to be correlated with metabolic rate differences 
among species (Adelman et al. 1988). Most such proposed factors presum- 
ably would tend to be more similar in closely related than in distantly relat- 
ed taxa, producing a "phylogenetic legacy" that is both good and bad for 
molecular analyses. On the positive side, such a historical legacy opens 
interesting possibilities for estimating "the rate of evolution of the rate of 
molecular evolution" within a phylogenetic tree (Thorne et al. 1998). On the 
negative side, it means that rate estimates from different pairs of species are 
statistically non-independent, and also that molecular clock calibrations are 
likely to be “local” rather than universal (Sanderson 2002). 

On the other hand, some leading researchers consistently maintained 
that sidereal time was the single best predictor of genetic divergence, and 







that molecular clocks could be calibrated universally across taxa (Wilson 
1985). As stated by A. C. Wilson et al. (1987), "Molecular evolutionary clocks 
have ticked at much the same rate per year in many eubacterial genes as in 
the nuclear genes of animals and plants." These authors also proposed an 
intriguing explanation for this conclusion based on the following assump- 
tions: most nucleotide substitutions involve neutral mutations; the mutation 
rate per year is higher in short-generation organisms; the fraction of effec- 
tively neutral mutations is lower in larger populations (because of the 
greater effect there of deterministic forces, including natural selection); and 
species with shorter generations tend to have larger populations. To the 
extent that these assumptions hold, a greater mutation rate in short-genera- 
tion species might be counterbalanced by a lower fraction of effectively neu- 
tral mutations, such that molecular evolutionary rates overall would remain 
fairly constant among diverse taxa. 

A somewhat related proposal is that if a great many genes are surveyed 
in the taxa under consideration, any rate variation across loci may tend to 
average out by the law of large numbers. Thus, even if large statistical errors 
are associated with single-gene sequences, reliable estimates of divergence 
times should emerge from multi-locus appraisals. In one empirical explo- 
ration of this thesis, Kumar and Hedges (1998) found excellent agreement 
between fossil-based estimates of divergence times for major vertebrate lin- 
eages and combined molecular divergence estimates from sequence 
appraisals of more than 650 nuclear genes (Figure 4.4). 
Figure 4.4 Agreement between composite molecular estimates and fossil-based 
estimates of divergence times for major vertebrate lineages. Composite molecular 
estimates were derived from 658 nuclear genes. (After Kumar and Hedges 1998.) 
Absolute and relative rate comparisons 


Interpretive controversies aside, how are molecular evolutionary rates 
assessed, and how is it that debates about rate heterogeneity continued for 
so long without final resolution? One difficulty is that molecular distance 
measures often become nonlinear with time (due to "saturation effects") at 
increased evolutionary depths (e.g., Figure 4.3A,D). Some arguments 
against molecular clocks have stemmed from appraisals at inappropriate 
regions of the divergence curves. Another problem is the difficulty of deter- 
mining confidence limits for genetic distance estimates (Nei 1987). Thus, 
any distance in Figure 4.3 is a point value with (often large) estimation 
errors. Another aspect of statistical concern is whether, for a given mole- 
cule, the mean substitution rate among lineages is equal to the variance in 
rate, as predicted under neutrality theory for a Poisson-like process. 
Several early studies concluded that the empirical variance in rates some- 
what exceeds the mean (Langley and Fitch 1974; Ohta and Kimura 1971), 
and this was interpreted as evidence against a uniform clock (Gillespie 
1986, 1988; Takahata 1988). 
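
The mean-versus-variance comparison mentioned above is often summarized as an index of dispersion (variance divided by mean), which is expected to be near 1 for a Poisson-like process. The sketch below is a simplified illustration only: the substitution counts are invented, the function name dispersion_index is hypothetical, and a real test would also need to account for unequal lineage durations and sampling error.

```python
from statistics import mean, variance


def dispersion_index(substitution_counts):
    """Ratio of sample variance to mean for substitution counts across lineages.

    Values near 1 are what a Poisson-like clock predicts; values well above 1
    indicate more rate variation among lineages than such a clock allows.
    """
    average = mean(substitution_counts)
    if average == 0:
        raise ValueError("mean number of substitutions must be positive")
    return variance(substitution_counts) / average


# Hypothetical counts of substitutions along five lineages of equal duration.
even_counts = [10, 12, 8, 11, 9]
uneven_counts = [4, 16, 9, 13, 8]

print(dispersion_index(even_counts))    # below 1
print(dispersion_index(uneven_counts))  # above 1 (overdispersed)
```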

Perhaps the most serious difficulty in calibrating molecular distance 
against sidereal time is that firm independent knowledge from fossil or biogeographic
evidence is also required, at least initially. Unfortunately, such 
information is uncertain or lacking for many taxa (Kidwell and Holland 
2002). Indeed, if this were not true, little motivation would exist for phylo- 
genetic reappraisals based on molecular data. All separation dates in Figure 
4.3, for example, came from fossil evidence, and the range in estimates of 
divergence time was in some cases extremely wide (e.g., Figure 4.3F). Fossils 
seldom provide the solid ground-truthing of separation dates that molecular 
biologists sometimes suppose because preserved remains are often scanty 
and confined to a few phenotypic attributes whose phylogenetic relevance is 
suspect. These problems are especially acute for morphologically simple 
creatures such as bacteria. Furthermore, even under the best of preservation 
circumstances, the earliest known appearance of a fossil provides only a min- 
imum date for the true evolutionary origin of the lineage that it represents. 

Biogeographic evidence can also be difficult to interpret, even in the 
cleanest of instances. To cite one example, it is well documented that the 
Isthmus of Panama rose above the sea about 3 million years ago, and it must 
therefore have curtailed any former gene flow between tropical marine fau- 
nas in the eastern Pacific and western Atlantic oceans after that time. Today, 
green turtles (Chelonia mydas) are circumtropically distributed, but their pop- 
ulations show a clear genealogical distinction in mtDNA between the 
Atlantic and Pacific (see Chapter 6). The estimated magnitude of net mtDNA 
sequence divergence between these two clades (0.6%), however, is tenfold 
lower than expected, assuming that the “conventional” mammalian mtDNA 
evolutionary rate (2% sequence divergence per million years; Figure 4.3A) 
has applied across the supposed 3 million years of evolutionary separation. 
One possibility is that mtDNA evolution in these turtles is slower than in 
Figure 4.5 Relative rate test. (A) The rationale behind the relative rate test. (B, C) 
Potential difficulties of this test. (See text for details.) 


mammals by an order of magnitude (Avise et al. 1992a). Or perhaps turtle 
mtDNA evolves at the standard mammalian pace, but Atlantic and Pacific 
populations were in recent genetic contact via dispersal of animals around 
South America or Africa during interglacial episodes of the Pleistocene. 
Similar molecular surveys have been conducted on more than 30 marine 
"geminate" species pairs inhabiting the Atlantic versus Pacific sides of 
Panama (see review in Lessios 1998). These pairs include numerous fishes 
(Bermingham et al. 1997; Grant 1987; Vawter et al. 1980), sea urchins 
(Bermingham and Lessios 1993; Lessios 1979, 1981), and shrimps (Knowlton 
et al. 1993). However, even these extensive molecular analyses, conducted in 
a superbly favorable biogeographic setting, have yielded uncertain conclu- 
sions about the magnitudes of possible rate variation in homologous classes 
of genetic markers (Bermingham et al. 1997; Marko 2002). 
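
The tenfold shortfall noted above for green turtles can be checked with a few lines of arithmetic. The sketch below is only illustrative: the numbers are those quoted in the text, while the variable names and the two alternative readings computed at the end are introduced here for convenience.

```python
# Figures quoted in the text; variable names are illustrative only.
rate_pct_per_myr = 2.0   # "conventional" mammalian mtDNA rate (% divergence per million years)
separation_myr = 3.0     # approximate time since the Isthmus of Panama rose (million years)
observed_pct = 0.6       # net mtDNA divergence between Atlantic and Pacific green turtles (%)

# Divergence expected if the mammalian rate applied for the full 3 million years.
expected_pct = rate_pct_per_myr * separation_myr
# Factor by which the observed divergence falls short of that expectation.
shortfall = expected_pct / observed_pct

# Two alternative readings of the same observation:
implied_rate = observed_pct / separation_myr     # rate, if separation truly dates to 3 Myr
implied_time = observed_pct / rate_pct_per_myr   # separation time, if the mammalian rate holds

print(expected_pct, shortfall, implied_rate, implied_time)
```

The first reading corresponds to a slower turtle clock, the second to more recent genetic contact between the two ocean basins; the divergence figure alone cannot distinguish them.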

To circumvent the requirement of firm separation dates from fossil or 
biogeographic evidence, molecular evolutionists also developed relative 
rate tests that do not depend on knowledge of absolute divergence times 
(Margoliash 1963; Sarich and Wilson 1973). Each test requires at least two 
related species (X and Y) and an outside reference species (Z) that branched 
off prior to the separation of X and Y. The rationale is illustrated in Figure 
4.5A. By definition, the true evolutionary distance between X and Y ($d_{XY}$) is
equal to the sum of their branch lengths from a common ancestor at point O
(i.e., $d_{XY} = d_{OX} + d_{OY}$). Similarly,

$$d_{XZ} = d_{OX} + d_{OZ}$$

or rearranged,

$$d_{OX} = d_{XZ} - d_{OZ} \qquad (4.1)$$

and

$$d_{YZ} = d_{OY} + d_{OZ}$$

or rearranged,

$$d_{OY} = d_{YZ} - d_{OZ} \qquad (4.2)$$

Subtracting Equation 4.2 from Equation 4.1 yields

$$d_{OX} - d_{OY} = d_{XZ} - d_{YZ} \qquad (4.3)$$

According to a molecular clock, $d_{OX}$ and $d_{OY}$ should be equal
($d_{OX} - d_{OY} = 0$) and, hence, $d_{XZ} = d_{YZ}$. Genetic distances $d_{OX}$ and
$d_{OY}$ cannot be measured empirically, but $d_{XZ}$ and $d_{YZ}$ can. The relative
rate test asks whether $d_{XZ}$ and $d_{YZ}$ as estimated by genetic data are
sufficiently different as to be incompatible with a strict molecular clock.

By similar logic, branches leading to extant sister taxa X and Y can also 
be deduced from empirical data to have different (or perhaps identical) 
lengths, even though neither of their genetic distances to the common ances- 
tor can be measured directly. Suppose the following hypothetical distances 
were observed among three extant taxa: $d_{XY} = 0.08$, $d_{XZ} = 0.19$, and $d_{YZ} = 0.17$.
Solution of three equations with three unknowns (using the logic of Fitch
and Margoliash 1967) yields the desired branch lengths as follows:

$$d_{OX} + d_{OY} = 0.08$$

plus

$$d_{OX} + d_{OZ} = 0.19$$

yields

$$2d_{OX} + d_{OY} + d_{OZ} = 0.27$$

minus

$$d_{OY} + d_{OZ} = 0.17$$

yields

$$2d_{OX} = 0.10 \quad \text{or} \quad d_{OX} = 0.05$$

The unique solutions are $d_{OX} = 0.05$, $d_{OY} = 0.03$, and $d_{OZ} = 0.14$. Note that
$d_{OX}$ and $d_{OY}$ differ despite the fact that the same amount of time has elapsed
since X and Y shared a common ancestor. Note also that the sums of all
branch lengths in this little tree agree perfectly with the empirical distances
between X, Y, and Z, meaning that the data have not been distorted in the 
reconstructed phylogeny. However, this seldom proves true when more 
than three taxa are considered. 
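
The elimination worked through above has a simple closed form: each branch length is half of the sum of the two distances involving that taxon, minus the third distance. A minimal sketch follows, applied to the same hypothetical distances; the function and variable names are illustrative and do not come from the text.

```python
def three_taxon_branch_lengths(d_xy, d_xz, d_yz):
    """Solve d_OX + d_OY = d_XY, d_OX + d_OZ = d_XZ, d_OY + d_OZ = d_YZ."""
    d_ox = (d_xy + d_xz - d_yz) / 2
    d_oy = (d_xy + d_yz - d_xz) / 2
    d_oz = (d_xz + d_yz - d_xy) / 2
    return d_ox, d_oy, d_oz

# Hypothetical distances from the worked example in the text.
d_ox, d_oy, d_oz = three_taxon_branch_lengths(d_xy=0.08, d_xz=0.19, d_yz=0.17)

# The quantity examined by the relative rate test (Equation 4.3):
# d_OX - d_OY equals d_XZ - d_YZ, which is zero under a strict clock.
rate_difference = d_ox - d_oy

print(round(d_ox, 2), round(d_oy, 2), round(d_oz, 2), round(rate_difference, 2))
```

With exactly three taxa the system always has a unique solution, which is why the fit to the observed distances is perfect here and usually is not when more taxa are added.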

Many relative rate tests for molecular data have been conducted (Graur 
and Li 2000). For example, based on their DNA-DNA hybridization studies, 
Sibley and Ahlquist (1986) reported that “genetic distances between the outlier
and each of the other species ... are always equal, within the limits of
experimental error” and that “thousands of such trios of species ... yield the
same result and attest to the uniform average rate of the DNA clock in
birds.” Mice (Mus), rats (Rattus), and hamsters (Cricetulus and Mesocricetus)
were likewise among the many organisms that provisionally passed relative 
rate tests early on (Li et al. 1987; O'hUigin and Li 1992). On the other hand, 
there have also been many failures to pass relative rate tests, and these failures
have contributed to such conclusions as that nDNA sequences evolve
faster in rodents than in primates (Li et al. 1987; Wu and Li 1985) and that 
particular cpDNA sequences evolve more rapidly in annual plants than in 
perennials (Bousquet et al. 1992; Gaut et al. 1992). 

Relative rate tests are not without difficulties, however, as illustrated in 
Figure 4.5. If X and Y separated very recently compared with their separation
from Z (Figure 4.5B), $d_{XZ}$ and $d_{YZ}$ may appear equal in an empirical test
even under a highly erratic clock (because the vast majority of evolution has
taken place in the long and shared OZ branch). Conversely, if the separation
of X and Y was temporally close to that of their common ancestor with Z
(Figure 4.5C), an incorrect assumption about the branching order of the
three species might exist, such that $d_{XZ}$ and $d_{YZ}$ could appear different even
under a nearly constant clock. Thus, errors both of false acceptance and false 
rejection of evolutionary clocks can be envisioned in particular tests of rela- 
tive rates. Additional nuances of the relative rate test are discussed in Tajima 
(1993) and Graur and Li (2000). 


Closing thoughts on clocks 


What can be concluded from the vast effort expended on assessment of 
molecular clocks? It is now undeniable that some, and probably most, 
molecular systems evolve at heterogeneous rates, not only among classes of 
sites within a given molecule but also across taxonomic groups. Thus, if pre- 
cise clocks exist, they are local rather than universal, both with respect to 
different classes of molecular features and different phylogenetic lineages. 
Nonetheless, time elapsed since common ancestry remains an important, 
and arguably the single best, predictor of molecular divergence, especially 
when genetic distance is measured across large numbers of loci. What justi- 
fies this latter statement? The evidence is mostly indirect, inconclusive in 
individual instances, but cumulatively compelling. First, nodes in numer- 
ous molecular phylogenetic trees generally seem to be at least roughly con- 
sistent with independent time estimates (see Figures 4.3 and 4.4), uncertain 
as these dates of separation may sometimes be. Second, molecular evolution 
often proceeds mostly independently of morphological and phenotypic evo- 
lution, in which rates can vary wildly under the influence of different selec- 
tion regimes. Finally, given current understanding of the mechanisms 
underlying DNA sequence evolution, it would be most surprising if mean 
genome-wide sequence divergence did not generally tend to increase with 
time.

In the near future, it will become commonplace to base phylogenetic
conclusions on DNA sequences from many genes, rather than just one or a 
few genes. The study by Kumar and Hedges (1998), which analyzed more 
than 650 nuclear genes to estimate phylogeny for mammalian orders and
major vertebrate lineages (see Figure 4.4), was among the pioneering efforts
in this regard, but as data from DNA sequences exponentially increase, such 
massive composite analyses may soon be standard practice in the field of
molecular phylogenetics. Such analyses should tend to smooth the high rate
variation often seen in the individual genes that traditionally were consid- 
ered one at a time in phylogenetic appraisals. 

Furthermore, in many instances, even molecular timepieces with less 
than full precision can provide significant improvements over phylogenetic 
understanding gained from nonmolecular data. Consider, for example, sin- 
gle-celled microbes, whose phylogeny was totally unknown prior to the 
application of molecular information. Phylogenetic patterns in rRNA genes 
and other loci have revealed stunning genetic relationships and subdivi- 
sions among microbial taxa, including those that early in evolution entered 
into endosymbiotic relationships with proto-eukaryotic cells (see Chapter 
8). Molecular clocks keep far from perfect time, but to dismiss the inherent 
time-dependent nature of molecular evolution out of hand would be to 
deny empirical access to an invaluable, and sometimes the sole, source of 
information on temporal history. 


Phylogenetic Reconstruction 


Phylogenetic trees (rooted) or networks (unrooted) are graphical representations
consisting of nodes and branches (pathways connecting nodes) that
summarize evolutionary relationships among particular taxa (Figure 4.6). In
most molecular studies, the units of phylogenetic analysis (operational taxonomic
units, or OTUs) are species or higher taxa (actually, the analyzed genetic
material that they house), but they can also be conspecific populations, individual
organisms, or non-recombined alleles of a specific gene, such as
mtDNA. The key requirement or assumption underlying a phylogenetic representation
is that tree branches be mostly non-reticulate or non-anastomose
(Chapter 8 describes exceptions), such that the tree faithfully captures
ancestral-descendant evolutionary history. External nodes in a phylogenetic tree or
network are normally extant OTUs, and internal nodes are deduced ancestral
units. Peripheral branches lead to external nodes; interior branches connect
internal nodes. Branch lengths reflect the number of evolutionary changes
along each ancestral-descendant pathway. If the genetic distance between
each pair of OTUs exactly equals the sum of the branch lengths connecting
that pair, the tree is said to be strictly additive (Waterman et al. 1977).
Departures from additivity provide one measure of the degree to which a
depicted phylogeny may have been distorted by homoplasy (convergence,
parallelisms, or reversals) in the molecular data, or perhaps by improper
behavior of the distance measure or the phylogenetic algorithm employed.

Figure 4.6 Alternative representations of phylogenetic relationships for six
extant taxa (A-F). Left: Unrooted network with scaled branches. Right: Rooted tree
(the heavy line is the root) with branches that are only roughly scaled. Internal
nodes in both drawings are indicated by black dots. Note that branch angles have
no meaning because branches may be rotated freely about any internal node
without materially affecting network or tree topology.

Phylogenetic representations may be graphed in several ways (see 
Figure 4.6). A tree or network is scaled when its branches are proportional 
in length to the numbers of genetic changes along them; otherwise, it is 
unscaled or partially scaled (although branch lengths may be indicated 
numerically along the diagram). A tree is rooted when an internal node is 
specified that represents the common ancestor of all OTUs under study; oth- 
erwise, the diagram is unrooted and is commonly referred to as a network. 
A tree is bifurcating when two immediate descendant lineages come from
each node, and multifurcating when three or more lineages do so. 

The process of estimating a phylogenetic tree can be remarkably chal- 
lenging, in large part because even small numbers of OTUs can, in principle, 
be connected by astronomical numbers of different trees, only one of which 
is presumably the correct representation of actual evolutionary history. For 
n OTUs, the theoretical number of different bifurcating unrooted networks
($N_U$) is

$$N_U = \frac{(2n-5)!}{2^{n-3}(n-3)!} \qquad (4.4)$$

and the theoretical total number of different bifurcating rooted trees ($N_R$) is

$$N_R = \frac{(2n-3)!}{2^{n-2}(n-2)!} \qquad (4.5)$$

(Felsenstein 1978a). Thus, the number of possible tree structures increases
dramatically as the number of taxa increases, and even the small value of n
= 10 yields $N_R$ = 34,459,425.
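
To get a feel for how quickly these counts grow, the two formulas (Equations 4.4 and 4.5) can be evaluated directly. The following is a small sketch using exact integer arithmetic; the function names are hypothetical.

```python
from math import factorial

def n_unrooted(n):
    """Number of bifurcating unrooted networks for n OTUs (Equation 4.4)."""
    if n < 3:
        raise ValueError("need at least 3 OTUs")
    return factorial(2 * n - 5) // (2 ** (n - 3) * factorial(n - 3))

def n_rooted(n):
    """Number of bifurcating rooted trees for n OTUs (Equation 4.5)."""
    if n < 2:
        raise ValueError("need at least 2 OTUs")
    return factorial(2 * n - 3) // (2 ** (n - 2) * factorial(n - 2))

# Tabulate both counts for a few values of n.
for n in (4, 5, 10, 20):
    print(n, n_unrooted(n), n_rooted(n))
```

For n = 10 the rooted count reproduces the value of 34,459,425 given in the text. Note also that the number of rooted trees for n OTUs equals the number of unrooted networks for n + 1 OTUs, because a root behaves like one additional terminal branch.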

Many phylogenetic algorithms work by searching among possible trees 
for those that exhibit desirable properties according to some specified opti- 
mality criterion (e.g., shortest total branch length under parsimony). 
However, it is currently impossible for even the fastest computers to exhaustively
search all possible trees when n is moderate or large. Thus, truncated
search procedures must be implemented with the hope that they will adequately
explore the vast parameter space to identify correct tree(s) according
to some specified optimality criterion. A second complication is that the
optimality criterion itself may or may not be a valid representation of evo- 
lution. A third fundamental difficulty is that a true phylogeny is seldom 
known with certainty from independent evidence (Atchley and Fitch 1991 
and Hillis et al. 1992 describe rare exceptions), so empirical appraisals of 
alternative phylogenetic methods normally rest on indirect evidence. The 
net result of these difficulties in evaluating alternative tree-building 
approaches has been a continuing debate over which method of phyloge- 
netic reconstruction is "best." 

A vast scientific literature, far beyond the scope of this book, addresses 
the merits and demerits of a wide variety of tree-building procedures for 
molecular (and other) data. Some key books on the topic include those by 
Hall (2004), Hillis et al. (1996), Li (1997), and Nei and Kumar (2000), to 
which interested readers are referred. What follows are merely brief descrip- 
tions and the rationales of the most commonly used algorithms. 


Distance-based approaches 


All distance-based approaches begin with an OTU x OTU matrix, the body 
of which consists of estimated pairwise genetic distances between taxa 
(Farris 1972). For n OTUs, there are n (n - 1)/2 such distances (excluding 
"self" comparisons along the matrix diagonal). Clearly, because OTUs have 
phylogenetic connections, such estimates cannot be treated as independent 
values in a statistical sense. Indeed, the historical connections that genetic 
distances register are the primary focus. Table 4.1 presents a hypothetical 
distance matrix for five OTUs (ten pairwise comparisons) that will serve to 
illustrate two of the most widely used distance algorithms for constructing 
phenograms (also called dendrograms or, loosely, "trees"). 


UPGMA CLUSTER ANALYSIS. Cluster analyses (of which there are several
variants), which group OTUs according to overall similarity or distance, are
the simplest methods computationally (Sneath and Sokal 1973). Under the
“unweighted pair group method with arithmetic averages” (UPGMA), a
distance matrix (such as that in Table 4.1) is scanned for the smallest distance
element, and the OTUs involved are joined at an internal node drawn
in an appropriate position along a distance axis (Figure 4.7A). In this example,
OTUs A and B are joined first, at distance level d = 0.04 (because the sum
of branch lengths connecting A and B is the observed d = 0.08). This distance
element in the matrix is then discarded. The matrix is scanned again for the
smallest remaining distance, which in this case is d = 0.12 between D and E.
These OTUs are clustered at level d = 0.06. The next smallest distance in the
matrix is d = 0.17 between C and B. However, B is already part of a previously
formed cluster with A, so C cannot be joined directly to B, but rather
must be connected through the A-B internal node. This clustering level is
determined by the arithmetic mean of the distances between C and OTUs in
the previous cluster [d = (0.19 + 0.17)/2 = 0.18]. Thus, C joins the A-B group
at d = 0.09. All that remains is to join the A-B-C cluster with the D-E cluster,
at a depth determined from the mean of all pairwise distances between
the OTUs they contain [d = (0.70 + 0.65 + 0.75 + 0.70 + 0.80 + 0.60)/6/2 =
0.35]. Note that this exhausts all distances in the matrix. The final UPGMA
phenogram is shown in Figure 4.7A.

Figure 4.7 Phenograms produced from the genetic distance matrix in Table 4.1
by various distance-based approaches: (A) UPGMA dendrogram; (B) neighbor-joining
network, unrooted; (C) neighbor-joining tree, rooted and right-justified.

Some points about UPGMA should be clarified. In each cycle of the clus- 
tering procedure, OTUs or previously formed clusters are grouped accord- 
ing to the smallest mean distance between the taxa involved (rather than 
smallest single distance element remaining in the matrix). Each OTU con- 
tributes equally to these mean distances (hence the term “unweighted”). All 
extant OTUs are depicted as “right-justified” along a genetic distance axis 
(Figure 4.7A). Finally, the dendrogram is implicitly rooted at the point 
where the deepest clusters join. The major assumption (and practical limita- 
tion) of UPGMA clustering is that evolutionary rates are equal along all den- 
drogram branches. Even so, UPGMA often performs surprisingly well in 
recovering proper tree structures in computer simulation tests (Nei et al. 
1983; Sourdis and Krimbas 1987; Tateno et al. 1982). This seems to be 
because estimates of genetic distance are subject to large stochastic errors, 
and the distance-averaging feature of UPGMA tends to reduce these effects 
(Nei 1987). 
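
The clustering cycle described above is short enough to sketch in code. The version below is a minimal illustration rather than a production implementation: the function name and data layout are hypothetical, and the ten distances are those of the five-OTU example (Table 4.1) as they appear in the text and in Box 4.2. At each cycle it joins the two clusters with the smallest mean distance and places the new node at half that mean.

```python
from itertools import combinations

# Pairwise distances for the five-OTU example, keyed by the pair of OTU names.
PAIRS = {"AB": 0.08, "AC": 0.19, "AD": 0.70, "AE": 0.65, "BC": 0.17,
         "BD": 0.75, "BE": 0.70, "CD": 0.80, "CE": 0.60, "DE": 0.12}
dist = {frozenset(pair): d for pair, d in PAIRS.items()}

def upgma(names, dist):
    """Return the joins as (cluster label, clustering level), in order."""
    clusters = [frozenset([name]) for name in names]
    joins = []

    def mean_distance(c1, c2):
        # Unweighted mean over all OTU pairs spanning the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: mean_distance(*p))
        level = mean_distance(c1, c2) / 2   # node depth is half the mean distance
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        joins.append(("".join(sorted(merged)), level))
    return joins

joins = upgma("ABCDE", dist)
for label, level in joins:
    print(label, round(level, 2))
```

The joins come out in the order A-B, D-E, A-B-C, and finally all five OTUs, at the clustering levels worked out by hand in the text.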


NEIGHBOR-JOINING METHOD. The neighbor-joining (“N-J”) method (Saitou
and Nei 1987) is conceptually related to cluster analysis, but allows for 
unequal rates of molecular change among branches. It does so by construct- 
ing, at each step of the analysis, a transformed distance matrix that has the 
net effect of adjusting branch lengths between pairs of nodes on the basis of 
mean divergence from all other nodes. The procedure is detailed in Box 4.2, 
again using the distance matrix in Table 4.1. The resulting unrooted net- 
work, showing deduced branch lengths as well as topology, is shown in 
Figure 4.7B. This network can also be rooted, for example by placing an 
ancestral node at the midpoint of the longest total set of branch lengths 
between extant OTUs (Figure 4.7C). Note the close similarity in this exam- 
ple between the structures of the N-J and UPGMA phenograms (Figure 
4.7A,C). This kind of agreement between alternative clustering algorithms is 
commonly observed for real molecular data sets. Readers interested in fur- 
ther details and the formal steps of the N-J operation, which is probably the 
most popular of the distance approaches in use today, should consult 
Studier and Keppler (1988) or Swofford et al. (1996). 
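
For readers who prefer code to tables, the cycle summarized in Box 4.2 can be sketched as follows. This is a hedged illustration, not a reference implementation: the function name, the node labels, and the data layout are hypothetical, and the distances are those of the five-OTU example. Unlike Box 4.2, the sketch does not round intermediate values, so some branch lengths differ in the third decimal place from the boxed figures. Ties between transformed distances are broken arbitrarily, which affects the order of joins but not the resulting unrooted network.

```python
from itertools import combinations

PAIRS = {"AB": 0.08, "AC": 0.19, "AD": 0.70, "AE": 0.65, "BC": 0.17,
         "BD": 0.75, "BE": 0.70, "CD": 0.80, "CE": 0.60, "DE": 0.12}
dist = {frozenset(pair): d for pair, d in PAIRS.items()}

def neighbor_joining(names, dist):
    """Return the branches of the unrooted network as (node, node, length)."""
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 1
    while len(nodes) > 2:
        n = len(nodes)
        # r: sum of distances from each node to all other current nodes.
        r = {i: sum(d[frozenset((i, j))] for j in nodes if j != i) for i in nodes}
        # Join the pair with the smallest transformed distance d_ij - (r_i + r_j)/(n - 2).
        i, j = min(combinations(nodes, 2),
                   key=lambda p: d[frozenset(p)] - (r[p[0]] + r[p[1]]) / (n - 2))
        d_ij = d[frozenset((i, j))]
        len_i = d_ij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        len_j = d_ij - len_i
        new = "node%d" % next_id
        next_id += 1
        edges += [(i, new, len_i), (j, new, len_j)]
        # Distances from the new internal node to every remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((k, new))] = (d[frozenset((i, k))]
                                          + d[frozenset((j, k))] - d_ij) / 2
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))   # final branch joins the last two nodes
    return edges

edges = neighbor_joining("ABCDE", dist)
# Length of the terminal branch leading to each extant OTU.
terminal = {}
for u, v, length in edges:
    for node in (u, v):
        if node in "ABCDE":
            terminal[node] = length
print({k: round(v, 4) for k, v in sorted(terminal.items())})
```

In this example D and E are joined first, as in Box 4.2, and the unequal terminal branches to D and E show how the method accommodates different amounts of change in sister lineages.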


Box 4.2 Cycling Operation of the Neighbor-Joining Algorithm

Each value of r is the sum of observed distances between the OTU of that row
and other extant OTUs or nodes. All values were rounded to two decimal points.

            A       B       C       D       E       r      r/3
A           —      0.08    0.19    0.70    0.65    1.62    0.54
B         -1.03     —      0.17    0.75    0.70    1.70    0.57
C         -0.94   -0.99     —      0.80    0.60    1.76    0.59
D         -0.63   -0.61   -0.58     —      0.12    2.37    0.79
E         -0.58   -0.56   -0.68   -1.36     —      2.07    0.69

Distance D to node 1 = 0.12/2 + (0.79 - 0.69)/2 = 0.11
Distance E to node 1 = 0.12 - 0.11 = 0.01

            A       B       C     Node 1    r      r/2
A           —      0.08    0.19    0.62    0.89    0.44
B         -0.82     —      0.17    0.66    0.91    0.46
C         -0.75   -0.79     —      0.64    1.00    0.50
Node 1    -0.78   -0.76   -0.82     —      1.92    0.96

Distance C to node 2 = 0.64/2 + (0.50 - 0.96)/2 = 0.09
Distance node 1 to node 2 = 0.64 - 0.09 = 0.55

            A       B     Node 2    r      r/1
A           —      0.08    0.08    0.16    0.16
B         -0.26     —      0.10    0.18    0.18
Node 2    -0.26   -0.26     —      0.18    0.18

Distance A to node 3 = 0.08/2 + (0.16 - 0.18)/2 = 0.03
Distance node 2 to node 3 = 0.08 - 0.03 = 0.05

            B     Node 3
B           —      0.05


COMPARISON OF DISTANCE MATRIX METHODS. Much debate has concerned
which of these (or other) distance algorithms produces the “best” phenogram.
One basis for this choice is goodness of fit, a measure of how well the inferred
distances in the “tree” match the empirical distance values in the original
matrix. In a molecular phylogenetic context, Prager and Wilson (1978) were
among the first to apply one such measure:

$$F = \frac{100 \sum_{i=1}^{s} |I_i - O_i|}{\sum_{i=1}^{s} I_i} \qquad (4.6)$$

where for the s pairwise comparisons among OTUs, I and O are input distances
from the matrix and output distances from the tree, respectively.
Smaller values of F indicate better fit (although not necessarily the correct 
tree). Other such measures include a co-phenetic correlation (Sneath and 
Sokal 1973) and percent standard deviation (Fitch and Margoliash 1967). As 
expected, procedures such as neighbor-joining that explicitly adjust tree 
branches to improve fit to an additive tree generally outperform methods 
such as UPGMA that do not (Avise et al. 1980; Berlocher 1981; Prager and 
Wilson 1978). 
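
As an illustration of the goodness-of-fit measure F (Equation 4.6), the sketch below applies it to the five-OTU example. The input distances are those of the example matrix. The output distances are not reported in the text: they were computed here by summing branch lengths along each path, using the UPGMA clustering levels described earlier and the rounded neighbor-joining branch lengths listed in Box 4.2. The function and variable names are hypothetical.

```python
def percent_fit(inputs, outputs):
    """F = 100 * sum(|I - O|) / sum(I) over all pairwise comparisons."""
    deviation = sum(abs(inputs[pair] - outputs[pair]) for pair in inputs)
    return 100 * deviation / sum(inputs.values())

# Input distances from the example matrix.
observed = {"AB": 0.08, "AC": 0.19, "AD": 0.70, "AE": 0.65, "BC": 0.17,
            "BD": 0.75, "BE": 0.70, "CD": 0.80, "CE": 0.60, "DE": 0.12}

# Output distances implied by the UPGMA dendrogram (twice the clustering level
# at which each pair first falls into the same cluster).
upgma_tree = {"AB": 0.08, "AC": 0.18, "AD": 0.70, "AE": 0.70, "BC": 0.18,
              "BD": 0.70, "BE": 0.70, "CD": 0.70, "CE": 0.70, "DE": 0.12}

# Output distances implied by the rounded neighbor-joining branch lengths.
nj_tree = {"AB": 0.08, "AC": 0.17, "AD": 0.74, "AE": 0.64, "BC": 0.19,
           "BD": 0.76, "BE": 0.66, "CD": 0.75, "CE": 0.65, "DE": 0.12}

f_upgma = percent_fit(observed, upgma_tree)
f_nj = percent_fit(observed, nj_tree)
print(round(f_upgma, 2), round(f_nj, 2))
```

Under these assumptions the neighbor-joining network gives the smaller F, in keeping with the general pattern described above, although a single small example should not be read as a test of either method.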

A second basis for choice among distance-based tree-building algorithms
involves the degree of congruence among trees derived from different
data sets. Because a given array of species presumably has a single general
phylogenetic topology along which all characters have evolved
(although not necessarily through the exact same transmission routes), 
methods of data analysis producing more highly congruent trees might be 
judged superior (Farris 1971). Several measures for evaluating levels of 
congruence among trees have been suggested (e.g., Farris 1973; Mickevich 
1978; Swofford et al. 1996). 

Another approach for comparing the performance of distance-based or 
other phylogenetic algorithms involves computer simulations of molecular 
change along trees generated under some specified model of evolution. 
Genetic distances among extant computer OTUs are estimated, and each 
algorithm's performance is evaluated by how well it recovers the known 
tree (Fiala and Sokal 1985; Jin and Nei 1991; Saitou and Nei 1987). Potential 
difficulties of this approach lie in assessing the biological plausibility of the 
model's assumptions and in the risk of circular reasoning when the best 
phylogenetic algorithm proves to be the one whose assumptions most close- 
ly match those underlying the simulation. All phylogenetic algorithms 
involve assumptions (transparent or opaque). 

A powerful method for evaluating algorithm performance was intro- 
duced by Hillis et al. (1992). They serially propagated bacteriophage T7 in the 
presence of a mutagen, experimentally dividing the culture at various time 
intervals, such that a known phylogeny was produced. Terminal lineages 
were then assayed for restriction site maps, and the data were used to infer 
evolutionary history by various phylogenetic methods. All five algorithms 
employed, which included N-J and UPGMA as well as a qualitative parsimo- 
ny approach, produced the correct branching order of the known topology, 
but they differed slightly in ability to recover correct branch lengths. Of 
course, such direct appraisals of phylogenetic methods using living 
organisms are logistically possible in only a few systems, such as T7, in which 
mutation rates are high and thousands of generations transpire each year. 

UPGMA automatically produces a rooted “tree.” To root an N-J tree or 
another such distance network, two procedures may be followed. First, one or 
more outgroup taxa (see Box 4.1) can be included in the analysis, in which
case the root is placed between an outgroup and the node leading to ingroup
members. Alternatively, if an approximate uniform rate of evolution is 
assumed for long time periods, the network may be rooted at the midpoint of 
the longest pathway between any extant OTUs (as was done in Figure 4.7C). 


Character-state approaches 


For most kinds of molecular markers, including DNA sequences, discrete 
character states permit phylogenetic analyses to be performed directly on 
the qualitative raw data, if so desired, without the requirement of first con- 
structing a distance matrix. Several such approaches are available. 


MAXIMUM PARSIMONY. A maximum parsimony (MP) tree is one that
requires the smallest number of evolutionary changes to explain observed
differences among OTUs. Consider Figure 4.8, which presents a parsimony
network for ten extant OTUs based on the depicted states of nine variable
characters (mtDNA restriction site patterns in this case). Construction of the
network from this simple data matrix can be done by hand. The process is
initiated by connecting any OTU to its nearest genetic neighbors via reference
to the character-state matrix. For example, haplotype D is one mutation
step removed from each of three other haplotypes (A, C, and E), which are
two steps removed from one another. Haplotype B, in turn, is one step from
A, two steps from D, and three steps each from C and E. Thus, the distribution
of character states for OTUs A-E yields the singular most parsimonious
network shown at the bottom right of Figure 4.8. Note that this portion of
the network is strictly additive.

Figure 4.8 Estimate of an unrooted parsimony network (right) based on an
OTU x character matrix (left) for a subset of mtDNA clones observed in green
turtles. Lowercase letters represent character states (mtDNA digestion profiles
produced by nine restriction enzymes; adjacent letters of the alphabet denote profiles
that differed by a single restriction site). Inferred restriction site changes along
branches of the network are indicated. Asterisks indicate probable instances of
homoplasy. (After Bowen et al. 1992.)

Generation of the complete network for all ten extant OTUs illustrates 
some complications that can arise. First, not all nearest-neighbor OTUs are 
a single mutation step apart, so in this case two hypothetical OTUs (HYP1 
and HYP2) were arbitrarily added to make all branches in the upper portion 
of the tree of unit length. Second, there is a large genetic gap distinguishing 
OTUs A-E from the assemblage H, I, J, L, M, so where the branch connect- 
ing these groups should be placed is not initially obvious. Here, D and 
HYP2 are joined because they differ by five steps, whereas any other inter- 
group branches would involve six steps or more. Third, some genetic char- 
acter states appear in different (presumably distantly related) portions of the 
network. For example, the character state StuI-d appears in some representatives
of both the upper and lower OTU groups, probably due to polyphyletic
origins from StuI-c. Similarly, the state EcoRI-e appears in OTUs I
and J, which are not adjacent genetically as judged by the other assayed 
characters. Such character states contribute to homoplasy (indicated by 
asterisks) by introducing additional steps along network branches beyond 
those that differentiate OTUs in the original character-state matrix. 
Nonetheless, in this example, the sum of all pairwise output distances in the 
network (256) is only slightly greater than the sum of all input distances
(250), indicating strong goodness of fit between network and data. 

In usual practice, data are analyzed by computer programs that search 
vast numbers of alternative trees for minimum total length. PAUP* (phylogenetic
analysis using parsimony) by Swofford (2000), and PHYLIP (phylogeny
inference package) by Felsenstein (available at
evolution.genetics.washington.edu/phylip.html) have been among the industry standards
for such analyses. Sometimes, many MP trees of different topology prove 
equally parsimonious or require similar numbers of steps. Nonetheless, 
such networks constitute only a small fraction of the vast universe of poten- 
tial trees, most of which require many more steps and can therefore be elim- 
inated from further serious consideration. 
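
For any single candidate tree, the number of changes it requires can be counted with the procedure of Fitch (1971), which treats character states as unordered and freely reversible. The sketch below is a minimal illustration of that counting step only, and is not the search strategy of PAUP* or PHYLIP. The four-OTU data matrix, the function names, and the nested-tuple encoding of trees are all hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a bifurcating tree.

    tree   : nested 2-tuples whose leaves are OTU names (strings)
    states : dict mapping each OTU name to its state for this character
    """
    def visit(node):
        if isinstance(node, str):                  # a leaf: its observed state
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                                 # no change needed at this node
            return shared, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]

def tree_length(tree, matrix):
    """Total number of changes summed over all characters in the matrix."""
    n_chars = len(next(iter(matrix.values())))
    return sum(fitch_changes(tree, {otu: row[i] for otu, row in matrix.items()})
               for i in range(n_chars))

# Hypothetical OTU x character matrix: three characters, states coded as letters.
matrix = {"W": "aab", "X": "aab", "Y": "bba", "Z": "bbb"}
tree_1 = (("W", "X"), ("Y", "Z"))
tree_2 = (("W", "Y"), ("X", "Z"))

length_1 = tree_length(tree_1, matrix)
length_2 = tree_length(tree_2, matrix)
print(length_1, length_2)
```

Here the first topology requires fewer changes than the second and would be preferred under the parsimony criterion. A full analysis repeats this count over many candidate topologies.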

Actually, parsimony approaches constitute a large family of related 
methods incorporating varying assumptions about how character-state trans- 
formations occur (Swofford et al. 1996). Under Wagner parsimony (Farris 
1970; Fitch 1971; Kluge and Farris 1969), for example, free evolutionary 
reversibility of character states is allowed, with changes in either direction 
equally likely. By contrast, Dollo parsimony (Farris 1977) assumes that each
non-ancestral character is uniquely derived (although multiple reversions to
the ancestral condition are allowed), and Camin-Sokal (1965) parsimony 
assumes that all evolutionary change is irreversible. Ideally, the assumptions 
employed in a particular analysis should match the true nature of evolution- 
ary change in the molecular markers utilized. For example, Wagner parsimo- 
ny might be appropriate for analyzing data that differ by transitional substi- 
tutions that interconvert readily; Camin-Sokal parsimony might work well 
for SINEs, which uniquely insert into genomes but once present are usually 
retained indefinitely; and Dollo parsimony might be suitable for some restric- 
tion data in which each particular site loss is mechanistically more likely by 
severalfold than its gain (DeBry and Slade 1985; Templeton 1983, 1987). 
Caution is always indicated, however, so it is probably best to attempt and 
compare multiple approaches, including distance-based analyses, before 
drawing firm biological conclusions from any reconstructed phylogenies. 


MAXIMUM LIKELIHOOD. Maximum likelihood (ML) covers a large class of 
procedures united by the principle that reconstructed trees should maximize 
the probability of observing an available data set under a particular model of 
evolutionary change. Of course, there are countless different models of mole- 
cular evolutionary change, but the idea is to choose one that is plausible 
based on data-informed or theory-informed knowledge about the system. A 
simple model might assume that the rate of nucleotide substitution is the 
same between all nucleotide pairs, that the substitution rate is the same 
throughout the tree, and that the expected number of substitutions along each 
tree branch is a function of this substitution rate and the length of the branch. 
A computer-based search then seeks to identify a tree that, under those 
assumptions, is most likely to explain the data that initiated the search. 
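
The simple model just described, with one substitution rate shared by all nucleotide pairs, is commonly called the Jukes-Cantor model (a name not used in the passage above). To show what "maximizing the likelihood" means in the smallest possible case, the sketch below treats just two aligned sequences joined by a single branch, so that the only unknown is the branch length d (expected substitutions per site). The counts of 100 sites and 10 differences are hypothetical, as are the function names; real programs optimize topology and all branch lengths jointly.

```python
import math

def jc_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of two aligned sequences separated by branch length d."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay     # probability a site shows the same base
    p_diff = 0.25 - 0.25 * decay     # probability of one particular different base
    # Each site also carries the equilibrium base frequency of 1/4.
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))

def jc_ml_distance(n_sites, n_diff):
    """Closed-form maximum likelihood estimate of d under this model."""
    p = n_diff / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

n_sites, n_diff = 100, 10            # hypothetical data
grid = [i / 1000 for i in range(1, 1001)]
best_on_grid = max(grid, key=lambda d: jc_log_likelihood(d, n_sites, n_diff))
analytic = jc_ml_distance(n_sites, n_diff)
print(best_on_grid, round(analytic, 4))
```

The brute-force search over candidate values of d and the closed-form estimate agree to within the grid spacing, and both exceed the raw proportion of differing sites (0.10) because the model allows for multiple substitutions at the same site.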

Advantages of ML are that it allows the user to specify an evolutionary 
model and that it usually identifies a single ML tree (although not necessar- 
ily one that is statistically or biologically better than alternatives that are 
close, or sometimes even grossly different, in structure). A practical disad- 
vantage is that ML is considerably slower than parsimony and distance- 
based approaches and can easily exceed the capacity of even high-speed 
desktop computers. The computer programs DNAML (introduced by 
Felsenstein 1981a) and TREE-PUZZLE (Strimmer and von Haeseler 1996) 
implement maximum likelihood procedures for molecular sequence data, as 
does PAUP* (Swofford 2000). 


BAYESIAN ANALYSIS. The hottest new approach in phylogenetics was made 
practicable by the computer program MrBayes (Huelsenbeck 2000), which 
again requires that the user postulate a model of evolution. Bayesian analy- 
sis is actually a variant of likelihood methods, but instead of seeking a sin- 
gle tree with the greatest likelihood of observing the data, it produces best 
sets of trees and entire probability distributions of likelihoods given the data 
and the evolutionary model specified (Rannala and Yang 1996). Like ML, it 
does so by searching a landscape of multitudinous possible trees, moving
from point to point on the likelihood surface in pursuit of higher vantages
(i.e., more likely trees). But unlike ML, which can get trapped on a local hill 
instead of the globally highest mountains, the algorithm in MrBayes (a 
Markov chain Monte Carlo process; Mau et al. 1999) allows the search to 
leap valleys and thereby gain not only a better perspective on the full likeli- 
hood landscape, but also a better opportunity to ascend its highest peaks. 
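
The acceptance rule at the heart of a Markov chain Monte Carlo search can be sketched in a few lines. The example below is a toy illustration only: it samples a single branch length rather than trees, and it assumes an equal-rates substitution model, an exponential prior, and a uniform proposal window. All of these, and the function names, are assumptions made for illustration rather than a description of MrBayes itself. The point to notice is that downhill moves are sometimes accepted, which is what allows the chain to cross valleys.

```python
import math
import random

def log_posterior(d, n_sites, n_diff, prior_rate=10.0):
    """Unnormalized log-posterior of branch length d, given n_diff differences
    among n_sites aligned sites (equal-rates model, exponential prior)."""
    if d <= 0.0:
        return float("-inf")
    p_diff = -0.25 * math.expm1(-4.0 * d / 3.0)  # one specific different base
    p_same = 1.0 - 3.0 * p_diff                  # same base at both ends
    log_like = (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)
    return log_like - prior_rate * d

def metropolis(n_sites, n_diff, n_steps=20000, step_size=0.05, seed=1):
    """Metropolis sampler; returns post-burn-in samples of the branch length."""
    rng = random.Random(seed)
    d = 0.1
    current = log_posterior(d, n_sites, n_diff)
    samples = []
    for _ in range(n_steps):
        proposal = d + rng.uniform(-step_size, step_size)
        candidate = log_posterior(proposal, n_sites, n_diff)
        # Uphill moves are always accepted; downhill moves are accepted
        # with probability equal to the posterior ratio
        if rng.random() < math.exp(min(0.0, candidate - current)):
            d, current = proposal, candidate
        samples.append(d)
    return samples[n_steps // 5:]  # discard the first fifth as burn-in

samples = metropolis(n_sites=200, n_diff=20)  # hypothetical data
posterior_mean = sum(samples) / len(samples)
print(round(posterior_mean, 3))
```

The retained samples approximate a full probability distribution for the parameter, rather than a single best estimate, which is the essential difference from the ML search described earlier.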


Conclusions about phylogenetic procedures 


The plethora of data analysis methods in molecular phylogenetics can be 
confusing, but it is important to remember a few major points. First, it is 
always desirable to attempt a match between the assumptions of a phylo- 
genetic procedure and the nature of evolution in the molecular characters 
assayed. Second, where possible, it is often advisable to include both a dis- 
tance-based and a character-based approach in an analysis for comparison. 
Distance-based methods are relatively straightforward and often provide a 
simple overview. Character-based approaches can be more information-rich 
and testable (sensu Popper 1968) because they provide hypotheses about 
character-state distributions along tree branches (Avise 1983; Baverstock et 
al. 1979; Patton and Avise 1983). Consider, for example, Figure 4.8, in which 
hypothesized character-state changes are explicitly summarized as an inher- 
ent aspect of the reconstruction process (such detail is impossible in distance 
analyses alone). Furthermore, if other evidence suggests that portions of the 
tree topology are suspect, the characters responsible for the difficulty can be 
identified. Perhaps they were scored incorrectly, or perhaps their states were 
polyphyletic, in which case they might be subjected to further analysis to 
assess the molecular basis of the apparent homoplasy. 

A third general point is that outgroup taxa should be included in phylo- 
genetic analyses because they facilitate tree rooting and help to establish char- 
acter-state polarities. Fourth, it is important to include confidence statements 
about putative clades. The most common approach is bootstrapping 
(Felsenstein 1985a; Hedges 1992), which involves three steps: sample (with 
replacement) from the existing data; generate a new tree from the re-sampled 
data; and then repeat the process perhaps hundreds of times to assess how 
frequently particular groups or clades appear in these pseudo-replicate trees. 
Thus, bootstrapping indicates how well putative clades in a tree are support- 
ed by the existing data (although not necessarily how well the available data 
represent genes not assayed). Interestingly, bootstrapping is unnecessary 
under Bayesian analyses because the probability of a given clade is already 
evident from its frequency in the set of Bayesian trees with high likelihood. 
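
The three bootstrap steps can be sketched as follows. This is a minimal illustration under stated assumptions: the four-taxon alignment is hypothetical, and the "tree" is reduced to a stand-in (the closest pair of taxa by raw sequence difference, which is the first cluster that a UPGMA analysis would join). A real analysis would rebuild a full tree from each pseudo-replicate with the phylogenetic method of choice.

```python
import random
from itertools import combinations

def p_distance(s1, s2):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)

def closest_pair(alignment):
    """Stand-in for tree building: the pair of taxa with the smallest distance.
    Ties are broken alphabetically, a simplification."""
    return min(combinations(sorted(alignment), 2),
               key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]))

def bootstrap_support(alignment, clade, n_reps=500, seed=1):
    """Fraction of pseudo-replicates in which `clade` is recovered."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(n_reps):
        # Step 1: sample alignment columns with replacement
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        pseudo = {name: "".join(seq[c] for c in cols)
                  for name, seq in alignment.items()}
        # Step 2: rebuild the "tree"; step 3: tally over many replicates
        if closest_pair(pseudo) == clade:
            hits += 1
    return hits / n_reps

# Hypothetical alignment of four taxa, 20 sites each
alignment = {
    "A": "ACGTACGTACGTACGTACGT",
    "B": "ACGTACGTACGAACGTACGT",
    "C": "ACGAACTTACGAACGTTCGT",
    "D": "TCGAACTTAGGAACCTTCGA",
}
support = bootstrap_support(alignment, ("A", "B"))
print(support)
```

The returned fraction plays the role of the bootstrap percentage usually written beside a node in a published tree.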

Fifth, the brouhaha about finding the "best" tree is often somewhat 
moot, because even less than perfect phylogenetic reconstructions still usu- 
ally capture the major features of a tree that may be of primary biological 
interest. For example, a common observation in phylogeographic surveys 
(see Chapter 6) is that regional sets of populations are tightly allied 
genealogically, yet are separated from one another by pronounced phylogenetic 
breaks. These gaps, which are often the historical footprints of Pliocene or 
Pleistocene separation events, are evident regardless of the method used to 
reconstruct the phylogenetic trees. 

One final point about phylogenetic reconstruction will provide a segue 
into the next section, on gene trees. All phylogenetic methods discussed above 
are based on the assumption that the characters analyzed are "independent," 
but this concept warrants elaboration. Characters are independent in a mech- 
anistic sense if changes in one character occur independently of those in 
another, such that their states do not covary because of pleiotropic effects in 
the underlying mutational process. For example, appearances and disappear- 
ances of restriction fragments tend to covary across digestion profiles, where- 
as the responsible restriction site changes do not; for this reason, it is prefer- 
able to code RFLP data as presence versus absence of particular restriction 
sites rather than fragments (or at least to accommodate the covariance of frag- 
ments in the phylogenetic analysis). The assumption of independence, critical 
to most computational algorithms, is probably valid for most molecular char- 
acters (unlike the situation for many morphological traits) and, indeed, is a 
major strength of multi-character molecular approaches. 

However, there is another sense in which molecular characters may be 
partially non-independent in evolution. When molecular characters are 
tightly linked, molecular states tend to covary in transmission across organ- 
ismal generations. Such is the case for nearby nucleotides in a nuclear gene, 
or for any and all nucleotides in a non-recombining organelle genome. Thus, 
although such character states are independent in the mechanistic sense of 
mutational origins, they are not independent with regard to genealogy. 
Recognition of this fact led to the important distinction between a "gene 
tree" and a "species tree" (Avise 1989a; Doyle 1992; Neigel and Avise 1986; 
Nichols 2001; Pamilo and Nei 1988; Tajima 1983; Tateno et al. 1982; Wilson 
et al. 1985). 


Gene Trees versus Species Trees 


When OTUs are multiple alleles of a complex locus (e.g., haplotype 
sequences in mtDNA genomes), a reconstructed phylogeny represents a 
gene tree. Any group of organisms has a single true pedigree that extends 
back through time as an unbroken chain of parent-offspring genetic trans- 
mission, but due to the non-deterministic nature of biparental Mendelian 
heredity in sexually reproducing species, not all genes will have trickled 
through this organismal pedigree in identical fashion. Thus, gene trees 
inevitably differ somewhat in topology from one unlinked locus to the next 
(Ball et al. 1990), both within and between related species. Hence, even in the 
absence of introgression (see Chapter 7) or horizontal gene transfer (see 
Chapter 8), a given gene tree within any set of species will differ in structure 
from others, as well as from the consensus population tree or species tree of 
which it is a part. These differences result from “lineage sorting” processes 
that may be exemplified as follows. 

Consider a single population pedigree through which haplotypes have 
descended. A simple conceptual case involves mtDNA inherited through 
female lines, but the principles apply to any non-recombined haplotypes of 
particular nuclear genes as well. As shown in Figure 4.9, some females, by 
chance, leave no daughters (their mtDNA lineages terminate), whereas others 
produce one or more daughters that may, in turn, contribute mtDNA to 
successive generations. Thus, as an inevitable consequence of differential 
organismal reproduction, any gene tree (cytoplasmic or nuclear) continually 
self-prunes: some branches are lost as others proliferate. At equilibrium, the 
expected frequency distribution of times to common ancestry can be 
approximated if the population demography is specified (see Box 2.3). In general, 
for relatively stable populations, it is unlikely that two or more founding 
lineages will survive beyond 4N generations, where N is the population size 
(Figure 4.10). In Chapter 6, more will be said about lineage sorting and related 
concepts of coalescent theory as they relate to population size. 

Figure 4.9 The allelic lineage sorting process within a population. Shown is an 
mtDNA gene tree through 20 generations. Each node represents an individual 
female, and branches lead to daughters. The tree was generated by assuming a 
Poisson distribution of progeny numbers with a mean of one daughter per female. 
(After Avise 1987.) 
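
The self-pruning of a gene tree can be illustrated with a small simulation. The sketch below assumes, as in Figure 4.9, that each female leaves a Poisson-distributed number of daughters with a mean of one, and it tracks how many founding matrilines still have living descendants in each generation. The function names are hypothetical, and the model is deliberately simple: population size is not regulated, so it drifts freely and may even go extinct, unlike the relatively stable populations to which the 4N expectation applies.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def surviving_founder_lineages(n_founders, n_generations, seed=1):
    """Number of founding matrilines still represented in each generation."""
    rng = random.Random(seed)
    population = list(range(n_founders))  # each female carries her founder's label
    history = [len(set(population))]
    for _ in range(n_generations):
        next_gen = []
        for label in population:
            # Daughters inherit the mother's founder label
            next_gen.extend([label] * poisson(rng, 1.0))
        population = next_gen
        history.append(len(set(population)))
    return history

history = surviving_founder_lineages(n_founders=50, n_generations=100)
print(history[:10])
```

Because a lost lineage can never reappear, the count of surviving founders can only fall through time, and much of the loss occurs within the first few generations.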

Now consider the lineage sorting process extended to two daughter (or 
sister) taxa, A and B, that stem from the same ancestral population. With 
regard to a gene tree within these sister populations or species, three cate- 
gories of phylogenetic outcomes are possible (Figure 4.11): (I) reciprocal mono- 
phyly, in which all alleles within each sister taxon are genealogically closer to 
one another than to any heterospecific alleles; (II) polyphyly, wherein some 
alleles in each taxon are genealogically closer to heterospecific alleles than to 
homospecific alleles; and (III) paraphyly, in which all alleles within one 
daughter taxon are one another's closest relatives, but some alleles in the 
second taxon are genealogically closer to heterospecific alleles. Category I in 
Figure 4.11 also illustrates how the depths of gene trees can vary even when 
their branching topologies agree with the species tree (e.g., the extant alleles 
in B trace to an ancient node "b," whereas the alleles in A trace to a recent 
node "a"). Categories II and III in Figure 4.11 illustrate how a gene tree can 
differ in fundamental branching order from a species tree. These discordances 
arise because many allelic separations predate the species split 
(unless the ancestral form went through an extreme population bottleneck 
just prior to speciation). Figure 4.12 shows diagrammatically how these three 
categories of phylogenetic relationships can characterize the same pair of 
sister taxa at different times following their separation. This occurs because 
lineage sorting typically converts any initial condition of genealogical 
polyphyly to one of paraphyly and eventually to one of reciprocal monophyly. 

Figure 4.10 Probabilities (x) of survival of two or more founding lineages 
through time. Shown are probability curves for populations of various size (N) in 
which females produce daughters according to a Poisson distribution with mean 
1.0. (After Avise et al. 1984a.) 

Figure 4.11 Three categories of phylogenetic relationships between two sister 
taxa (A and B) are possible with respect to an allelic genealogy. Lowercase letters 
point out important ancestral nodes to which extant alleles or haplotypes trace. 
Solid dark bars indicate barriers to reproduction (extrinsic or intrinsic). The 
phylogenetic categories in the gene tree are as follows: I, reciprocal monophyly; 
II, polyphyly; III, paraphyly of A with respect to B. (After Avise et al. 1983.) 

If evolutionary genetic distances among haplotypes are measured in 
units of time since common ancestry, these three categories of phylogenetic 
relationships between sister taxa (with respect to a gene tree) may be defined 
by the formal inequalities in Table 4.2. Neigel and Avise (1986; see also 
Hudson and Coyne 2002; Rosenberg 2003) employed replicated computer 
simulations to monitor the probabilities of each phylogenetic status of sister 
taxa with respect to mtDNA lineages (Figure 4.13). Shortly after their 
separation, it is highly likely that sister taxa will exhibit a polyphyletic gene 
tree status. At intermediate times since speciation (typically N to 3N generations, 
where N is the population size of each sister taxon), probabilities of polyphy- 
ly, paraphyly, and monophyly are intermediate as well. Only after about 4N 
generations do sister taxa finally have a high probability of becoming recip- 
rocally monophyletic. Similar results apply to nuclear genes (Nei 1987), 
although times to monophyly are extended accordingly because of the expect- 
ed fourfold larger effective population sizes for nuclear loci (see Box 2.3). 
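
The distance criteria of Table 4.2 translate directly into a short classification routine. The sketch below is illustrative only: the function name and the example distances are hypothetical, and ties (a maximum within-taxon distance exactly equal to the minimum between-taxon distance) are not covered by the table, so the code arbitrarily treats them as a failure of monophyly.

```python
def classify_status(d_aa, d_bb, d_ab):
    """Classify sister taxa A and B with respect to a gene tree.

    d_aa, d_bb: pairwise distances among haplotypes within A and within B
    d_ab:       pairwise distances between haplotypes of A and of B
    Distances are assumed to be linearly related to time since common ancestry.
    """
    a_monophyletic = max(d_aa) < min(d_ab)
    b_monophyletic = max(d_bb) < min(d_ab)
    if a_monophyletic and b_monophyletic:
        return "I: reciprocal monophyly"
    if not a_monophyletic and not b_monophyletic:
        return "II: polyphyly"
    if not a_monophyletic:
        return "IIIa: A paraphyletic with respect to B"
    return "IIIb: B paraphyletic with respect to A"

# Hypothetical distances (arbitrary time units)
print(classify_status(d_aa=[1, 2], d_bb=[1, 3], d_ab=[5, 6, 7, 8]))
print(classify_status(d_aa=[1, 9], d_bb=[2, 8], d_ab=[3, 4, 9, 9]))
print(classify_status(d_aa=[1, 9], d_bb=[1, 2], d_ab=[4, 5, 9, 9]))
```

In practice the three lists would be extracted from a matrix of estimated genetic distances among all sampled haplotypes.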

Such models assume selective neutrality at the locus per se (plus some 
specified set of population demographic conditions), but dramatically differ- 
ent outcomes can arise if strong selection occurs at or near the locus under 
examination. For example, when balancing selection maintains haplotype 
polymorphisms within a species, expected times to reciprocal monophyly in 
a gene tree can be much longer than those expected under a neutral model 
because lineage sorting at that locus is effectively inhibited. Beginning with 
several studies published in the late 1980s on two such balanced polymorphisms, 
one involving major histocompatibility loci in mammals (Figueroa et al. 
1988; Lawlor et al. 1988; McConnell et al. 1988; Takahata and Nei 1990) and the 
other a self-incompatibility locus in plants (Ioerger et al. 1990), it soon became 
apparent that some such balanced polymorphisms can persist for millions of 
years and be maintained across sequential speciation events. 

Figure 4.12 The lineage sorting process extended to two sister taxa. Shown are 
distributions of allelic lineages at a single gene through an ancestral population 
subdivided at time X into two daughter populations or species. With respect to this 
gene tree, these sister species are polyphyletic between times X and Y, are 
paraphyletic between times Y and Z, and are reciprocally monophyletic beyond time Z. 
(After Avise and Ball 1990.) 

Figure 4.13 Probabilities of reciprocal monophyly (I), polyphyly (II), and 
paraphyly (III) for two sister taxa at indicated numbers of generations following a 
simulated speciation. In each of 100 replicate computer runs, daughter species 
were founded by 20 and 30 individuals respectively and allowed to grow rapidly 
to carrying capacity N = 200. (After Neigel and Avise 1986.) 

Discordances between species splitting patterns and the topologies of 
gene trees can also characterize taxa that separated anciently, but whose spe- 
ciations occurred close together in time (Figure 4.14). The same kinds of lin- 
eage sorting processes are responsible: The lineages from the polymorphic 
ancestral gene pool that happen to have reached fixation in the descendant 
taxa may, by chance, be those that produce a gene tree/species tree 
discordance (Takahata 1989; Wu 1991). Such discordances can create problems for 
phylogeny estimation at the species or population level. For example, 
intense debate about the relationships of humans, chimpanzees, and gorillas 
has centered on which phylogenetic tree is "true": humans and chimps 
as sister taxa, or perhaps chimps and gorillas, or perhaps humans and gorillas. 
The branching pattern in this related triad of species clearly consists of 
two closely spaced nodes like those in Figure 4.14. Many molecular assays 
have been applied to this question, but not all genetic data have yielded 
exactly the same phylogenetic outcome (see Chapter 8). From the theory 
outlined above concerning quasi-independent gene trees and the 
idiosyncrasies of lineage sorting across closely spaced phylogenetic nodes, perhaps 
no single outcome should be expected. 

Table 4.2 

| Category | Phylogenetic status | Distance relationship^a |
|---|---|---|
| I | A and B monophyletic | max $d_{AA}$ < min $d_{AB}$ and max $d_{BB}$ < min $d_{AB}$ |
| II | A and B polyphyletic | max $d_{AA}$ > min $d_{AB}$ and max $d_{BB}$ > min $d_{AB}$ |
| IIIa | A paraphyletic with respect to B | max $d_{AA}$ > min $d_{AB}$ and max $d_{BB}$ < min $d_{AB}$ |
| IIIb | B paraphyletic with respect to A | max $d_{AA}$ < min $d_{AB}$ and max $d_{BB}$ > min $d_{AB}$ |

Source: Neigel and Avise 1986. 

^a Maximum evolutionary distances within either taxon (max $d_{AA}$ or max $d_{BB}$) 
versus minimum distance between sister taxa (min $d_{AB}$) are the deciding criteria 
(assuming that these genealogical distances are linearly related to time; see 
Figure 4.11 and text). 

These perspectives stemming from molecular research reveal several 
points of qualitative importance to phylogenetics, beyond the immediate fact 
that gene trees and species trees can differ in branching topology. First, with 
regard to haplotype relationships, the phylogenetic status of a given pair of 
species is itself an evolutionarily dynamic characteristic, with a usual time 
course subsequent to speciation being polyphyly or paraphyly preceding 
reciprocal monophyly (see Figures 4.12, 4.13). Second, the phylogenetic 
status of species is a function both of the pattern of population splitting and 
of historical demography within the populations involved (Avise et al. 
1984a). For example, sister species with large effective population sizes (see 
Box 2.2) will tend to retain a polyphyletic or paraphyletic status for longer 
times than will species with small $N_e$, all else being equal. Third, in 
accounting for the appearance of “heterospecific” alleles within a given 
species, it now is clear that possibilities involving lineage sorting from an 
ancestral gene pool must be considered (in addition to the usual scenarios 
of interspecies transfer mediated by introgressive hybridization; see 
Chapter 7). 

Figure 4.14 Gene genealogies within a species phylogeny. Shown are two 
topologically distinct gene trees (thin dark lines) possible within a species phylogeny 
(broader shaded branches) that consists of two sister taxa and an outgroup. In (A), 
the gene tree and the species tree have the same branching pattern, whereas in (B) 
the branching topologies differ. For neutral alleles, the probability of the 
discordance exemplified in (B) is given by $\tfrac{2}{3}e^{-T/(2N_e)}$ (Nei 1987), where 
$t_1$ is the time of the first speciation, $t_2$ is the time of the second speciation, 
$T = t_2 - t_1$, and $N_e$ is the effective population size. 
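
The dependence of gene tree/species tree discordance on the spacing of speciation events can be made concrete with the expression attributed to Nei (1987) in the caption of Figure 4.14: for a neutral locus, the probability of discordance is two-thirds of exp(-T/2Ne), where T is the interval between the two speciation events (in generations) and Ne is the effective population size during that interval. The sketch below simply evaluates that expression; the function name and the example values are hypothetical.

```python
import math

def discordance_probability(t_interval, n_e):
    """Probability that a neutral gene tree differs in topology from the
    species tree: (2/3) * exp(-T / (2 * Ne)), with T in generations."""
    return (2.0 / 3.0) * math.exp(-t_interval / (2.0 * n_e))

n_e = 10000  # hypothetical effective population size
for multiple in (0, 1, 2, 4, 10):
    t_interval = multiple * n_e
    print(multiple, round(discordance_probability(t_interval, n_e), 4))
```

When the two speciation events coincide (T = 0), the three possible gene tree topologies are equally likely, so two of every three gene trees are discordant. The probability then falls off exponentially as the internode lengthens relative to Ne.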


BOX 4.3 Isolation of DNA Haplotypes 

The table enumerates some special genetic systems and approaches for the 
isolation of DNA haplotypes. When such systems also exhibit high genetic variation 
but little or no inter-allelic recombination, they provide suitable potential 
opportunities for construction of gene trees. 

| Approach | Rationale | Comments | Early (or otherwise key) reference |
|---|---|---|---|
| Organelle genomes | Most individuals effectively homoplasmic and haploid; non-recombinational heredity | By far the most widespread source for gene-tree data | Avise 1989a |
| Sex chromosomes (e.g., X or Y in mammals, Z or W in birds) | Heterogametic sex is haploid; limited recombination, particularly in the sex-specific Y or W chromosomes | Relatively few genes identified or surveyed so far | Bishop et al. 1985; Hurles and Jobling 2001; Vulliamy et al. 1991; see Chapters 6 and 7 |
| Species with haplo-diploid sex determination | Males are haploid in many hymenopteran insects | Not yet widely capitalized upon for full gene trees | Hall 1990 |
| Species with prominent haploid phase of life cycle, or of a particular tissue | Gametophyte stage of mosses, for example, is haploid; endosperm in seeds of gymnosperms is a haploid product (gametophyte) from the female parent | Not yet widely capitalized upon for full gene trees; does not apply to triploid endosperms of angiosperms | McDermott et al. 1989 |
| Haploid species | Haploid microorganisms should be suitable, provided that sexual reproduction and recombination are limited | Relatively few attempts | Nelson et al. 1991 |
| Selfing diploid species | Highly inbred strains (natural or artificial) usually carry identical-by-descent alleles | Often used to advantage in Arabidopsis, for example | Stahl et al. 1999 |
| PCR amplification in vitro from single sperm or egg, or single DNA molecule | Gametes are haploid; each molecule represents one haplotype | Technical hurdles high, but accomplished successfully in humans and a few other species | Boehnke et al. 1989; Grewal et al. 1999; H. Li et al. 1988; Lien et al. 1999; Navidi and Arnheim 1999; Ruano et al. 1990; Stephens et al. 1990; Sun et al. 1995 |
| DNA amplification in vivo (in biological vectors) | Cloning passes DNA through a bottleneck of one molecule | Highly laborious; concerns about PCR misincorporation | Scharf et al. 1986 |
| Allele-specific PCR amplification and related approaches | If primers can be identified as allele-specific, they can be used to amplify haplotypes even from heterozygous loci | Potentially of wide nuclear applicability, but primer design is labor-intensive | Fullerton et al. 1994; Harding et al. 1997; Newton et al. 1989 |
| SSCP and DGGE gels | Physical separation of haplotypes at a locus following PCR amplification | Potentially useful, but seldom employed to date in a gene-tree context | See section on SSCPs in Chapter 3 |
| HAPSTR and SNPSTR assays | Another method to physically separate nuclear haplotypes, in this case in regions surrounding STR loci or specific SNPs | Still mostly in development | See relevant section in Chapter 3 |
| Extraction of individual chromosomes | Use of inbred strains, or of controlled crosses producing individuals with chromosomes identical by descent; especially powerful when applied to genes within chromosome inversion systems where recombination is limited or absent | Methods most readily available in Drosophila | Aquadro et al. 1986, 1991 |


Because of its rapid pace of evolution, its “haploid” packaging within 
most organisms, and its non-recombining mode of transmission, mtDNA has 
provided the vast majority of empirical data suitable for estimating gene 
trees over microevolutionary time scales (see Chapter 6). In principle, data 
from nuclear loci could be exploited similarly, and some successful empirical 
examples do exist (Antunes et al. 2002). In general, however, three major 
practical difficulties arise. First is the technical problem of isolating 
individual haplotypes of a nuclear locus from diploid organisms. Box 4.3 describes 
several genetic systems and experimental approaches that might circumvent 
this difficulty. A second potential complication, especially at the intraspecific 
level, is intragenic recombination, which over time in a population can mix 
and match various pieces of otherwise separate haplotypes at a nuclear 
locus. A third complication is gene conversion, wherein particular DNA 
sequences in effect are converted to those of another allele, or even to those 
of another locus in the same gene family (e.g., Popadic and Anderson 1995; 
Popadic et al. 1995). 

Any shuffling of genetic material among alleles by intragenic 
recombination or gene conversion, if frequent over time frames relevant to a 
genealogical reconstruction, will obscure the otherwise linear evolutionary 
histories of particular haplotypes within a species (Hudson 1990). Although 
statistical analyses of patterns of nonrandom association (disequilibrium) 
among tightly linked polymorphic markers can help to reveal whether 
recombination has been frequent in the history of a gene region (Clark 1990; 
Crandall and Templeton 1999; Kuhner et al. 2000; Stephens and Nei 1985; 
Stephens et al. 2001a,b), these and related methods (e.g., McGuire et al. 1997) 
usually encounter serious limitations in reconstructing haplotypes when 
recombination has been other than rare (Posada 2002; but see also Maynard 
Smith and Smith 1998). Thus, if the intent is to recover a non-anastomose 
gene tree, attention must normally be confined to DNA segments with little 
or no recombination (Box 4.4). 
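
The notion of nonrandom association between linked markers can be quantified with the standard disequilibrium coefficient D and its normalized square, r². The sketch below is a minimal illustration, assuming phased haplotypes coded as strings of "0" and "1" at biallelic sites; the function name and the example data are hypothetical. It is not one of the recombination-detection methods cited above, only the basic statistic on which many of them build.

```python
def linkage_disequilibrium(haplotypes, i, j):
    """Return (D, r_squared) between biallelic sites i and j.

    haplotypes: list of equal-length strings of '0'/'1' (phased haplotypes).
    D = freq(1 at both sites) - freq(1 at site i) * freq(1 at site j)
    """
    n = len(haplotypes)
    p_i = sum(h[i] == "1" for h in haplotypes) / n
    p_j = sum(h[j] == "1" for h in haplotypes) / n
    p_ij = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    # r^2 is undefined when either site is monomorphic
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared

# Only two of the four possible haplotypes occur: complete association
print(linkage_disequilibrium(["00", "00", "11", "11"], 0, 1))
# All four haplotypes equally common: no association
print(linkage_disequilibrium(["00", "01", "10", "11"], 0, 1))
```

Strong associations across many pairs of sites suggest that the region has experienced little effective recombination and may therefore be suitable for gene tree reconstruction, whereas weak associations suggest a history of frequent inter-allelic exchange.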







BOX 4.4 Intraspecific Gene Trees for Nuclear Loci 





Most early attempts to study nuclear gene trees at the intraspecific level 
involved Drosophila species because experimental crosses could be conducted to 
“extract” individual chromosomes from wild diploid fruit flies. Each use of this 
breeding procedure resulted in a pure strain whose members were “identical by 
descent” for a particular chromosomal haplotype (see Box 4.3). Such haplotypes 
could then be assayed for DNA sequences at particular loci, and the resulting 
data could be used in attempts to reconstruct allelic genealogies (haplotype trees 
or gene trees). 

One classic application of this approach resulted in a successful estimate 
of a gene tree for Adh (alcohol dehydrogenase) in D. melanogaster, as summa- 
rized in Figure A. Any clean gene tree reconstruction of this sort requires that 
the haplotypes involved have had a history of little or no recombination over 
the time scales of the phylogeny. When such stretches of DNA with limited 
internal recombination can be identified, the gene genealogies that they imply 
can also be used to map phenotypes associated with that chromosomal region 
(Templeton et al. 1992). An example involving phylogenetic placements of the 
“fast” (F) and “slow” (S) protein electromorphs at Adh is shown in Figure A. 
The rationale for this type of endeavor is that any phenotypes dictated by alle- 
les within the non-recombining marker region would be embedded in the 
same evolutionary history that is reflected in the gene tree. 

However, it has proved difficult to predict which gene regions are likely to 
be sufficiently free of inter-allelic recombination to permit gene tree reconstruc- 
tions. For example, similar attempts to construct an intraspecific allelic phyloge- 
ny for Adh in D. pseudoobscura were mostly thwarted by an absence of strong 
nonrandom associations among linked restriction sites, apparently due to a his- 
tory of more frequent inter-allelic exchange (Schaeffer and Miller 1992; Schaeffer 
et al. 1987; but see also Schaeffer et al. 2001). The xanthine dehydrogenase region 
of D. pseudoobscura also showed few nonrandom associations among restriction 
sites (Riley et al. 1989), and the same proved true for several other loci (e.g., 
notch, white, zeste-tko, and perhaps amylase) in D. melanogaster (Aguadé et al. 
1989a; Langley and Aquadro 1987; Langley et al. 1988; Schaeffer et al. 1988). 

Aquadro et al. (1991) capitalized on the recombination suppression 
properties of chromosomal inversions in D. pseudoobscura to generate a phylogeny 
for an amylase gene (Amy) that is contained within the inverted region of the 
third chromosome. The reduction in effective recombination in inversion 
heterozygotes is dramatic and occurs because crossing over inside the inverted 
region normally produces dysfunctional duplication and deficiency products 
that are shunted to polar bodies, where they fail to participate in zygote forma- 
tion (also, Drosophila males lack recombination). The biological significance of 
recombination suppression in this system is that it facilitates the maintenance 
of linked and apparently adaptive epistatic complexes of genes within the 
inverted regions (Schaeffer et al. 2003). A gene tree based on restriction site 
maps for 28 Amy haplotypes is presented in Figure B; the karyotypic parsimo- 
ny network is shown in Figure C. 

The following conclusions emerged from comparisons of this tree and 
network (Aquadro et al. 1991): restriction site differences are greater among gene 
arrangements than among haplotypes within the same gene arrangement; the 
gene phylogeny based on Amy is generally concordant with the inversion 
phylogeny as gauged by gross karyotype; and from application of a molecular 
clock, the inversion polymorphism is old (perhaps about 2 million years). A 
follow-up study utilized direct DNA sequence data at Amy to deduce that ST 
could be excluded as the ancestral chromosomal condition, TL could not, and 
SC appeared to be the most likely ancestral arrangement (Popadic and 
Anderson 1994). 

(A) Classic example of an intraspecific nuclear gene tree involving the alcohol 
dehydrogenase (Adh) gene in Drosophila melanogaster. Shown is a parsimony 
network for 18 haplotypes (in ovals) as identified by 15 restriction sites and other 
DNA sequence characters (coded from left to right in the 5' to 3' direction). This 
case also illustrates how inter-allelic recombination at a nuclear locus can 
sometimes be deduced from haplotype data. Note from the black and gray bars that 
the 5' end of a probable recombinant haplotype appears to stem from one portion 
of the parsimony network and the 3' end from the other. From this phylogeny and 
additional genetic evidence, it was also surmised that the "fast" allozyme allele 
(F) probably evolved recently from a "slow" (S) ancestral allele (Aquadro et al. 
1986; Ashburner et al. 1979; Stephens and Nei 1985). (From Avise 2000a, using 
data from a broader genealogy presented in Aquadro et al. 1986.) 

(B, C) Molecular and karyotypic phylogenies in Drosophila pseudoobscura. (B) Gene 
tree for 28 molecular haplotypes observed at the amylase (Amy) locus. Also 
indicated are the chromosomal inversion types from which these Amy haplotypes 
were extracted: ST (Standard); AR (Arrowhead); KL (Klamath, from D. persimilis); 
CH (Chiricahua); SC (Santa Cruz); and TL (Treeline). (After Aquadro et al. 1991.) 
(C) Cytogenetic phylogeny of these same gene arrangements. 

Outside such inverted regions (as well as within them; Navarro et al. 1997, 
2000), many demographic as well as molecular factors can influence the history 
of effective recombination within a gene. For example, haplotype contents of 
species that are spatially subdivided should tend to exhibit greater linkage dis- 
equilibrium (nonrandom associations) than those in high-gene-flow species in 
which alleles are routinely brought together such that recombination among 
them is at least possible (Baum and Shaw 1995). Different chromosomal 
regions also are known to differ inherently in recombination rates, one conse- 
quence again being different patterns of linkage disequilibrium between mark- 
ers (Aguadé and Langley 1994; Aguadé et al. 1989b; Aquadro et al. 1994; Begun 
and Aquadro 1992). For these and additional reasons, different genes within 
the same organismal phylogeny can and often do show very different patterns 
of genetic variation and genealogical relationships (e.g., Hasson and Eanes 
1996; Machado et al. 2002). 

Unfortunately, the technical and biological challenges of estimating auto- 
somal gene trees at the intraspecific level (such as those in the accompanying 
figures) have seldom been overcome in sexual diploid species, so relatively few 
good examples exist in the current literature (see Chapters 6 and 7). 


Gene trees and species trees are equally "real" phenomena, merely 
reflecting different aspects of the same phylogenetic process. Thus, occa- 
sional discrepancies between the two need not be viewed with consterna- 
tion as sources of "error" in phylogeny estimation. When a species tree is of 
primary interest, gene trees can assist in understanding the population 
demographies underlying the speciation process, as well as the species 
splitting patterns themselves (see Chapters 6, 7, and 8). Of course, for such 
purposes it would be desirable to include information from multiple gene 
genealogies. Each gene tree also is of inherent interest because it describes 
the evolutionary history of genetic changes within a localized bit of the 
genome. In studies of the ages and origins of specific genetic adaptations 
(or disorders), for example, such single-locus reconstructions can become 
the primary foci of attention (see Box 4.4). 


SUMMARY 


1. Population genetic theory and phylogenetic theory are highly germane to 
analyses and interpretations of molecular data. Many methods of data analy- 
sis (described throughout this book) are idiosyncratic to particular research 
questions, but this chapter outlines some major principles and methods for 
estimating phylogenetic trees within and among species. In rough order of 
historical appearance, these include phenetic methods of numerical taxono- 
my, Hennigian cladistics as applied to small numbers of discrete characters 
with alternative states, various forms of parsimony analysis (a logical out- 
growth of cladistics) as applied to more complicated data sets, maximum like- 
lihood methods, and Bayesian approaches (the latest and most powerful 
approach in phylogenetic analyses of DNA sequences). 


2. Empirical studies have documented substantial variation of several kinds in
molecular evolutionary rates: across nucleotide positions within a codon; 
among non-homologous genes within a lineage; among classes of DNA within 
a genome; among different genomes (nuclear versus organelle) within an 
organismal lineage; and among organismal lineages with respect to particular 
classes of homologous genes and characters. Nonetheless, a general time- 
dependent nature of molecular evolution is also evident. The concept of 
molecular clocks has played a major role in molecular phylogenetics. Both 
absolute and relative rate tests have been widely employed in the assessment 
of molecular evolutionary tempos in different taxa. 


3. Important distinctions exist between a gene tree and a species tree. These two 
aspects of genealogy provide different but mutually informative phylogenetic 
perspectives on evolutionary processes. 
Individuality and Parentage





With the recognition ... that the ... genome is replete with DNA sequence 
polymorphisms such as RFLP's, it was only a small leap to imagine that 
DNA could, in principle, provide the ultimate identifier. 

E. S. Lander (1991) 


Most species of sexually reproducing organisms harbor sufficient genetic varia- 
tion that appropriate molecular assays can distinguish each individual from all 
others with near certainty. Furthermore, polymorphic molecular markers with 
known pathways of hereditary transmission afford powerful opportunities to 
identify parent-offspring links. Issues of genetic identity versus non-identity and 
of biological parentage (genetic maternity and paternity) fall at the extreme 
microevolutionary end of the genealogical continuum. Especially suitable for 
such analyses are highly variable nuclear genome markers with specifiable 
modes of Mendelian inheritance. These markers include the allozyme products of 
numerous protein-coding genes as well as DNA-level “fingerprints” such as 
those provided by minisatellite loci, RAPDs, and (especially in recent years) 
microsatellites. 


Human Forensics 


Several of the techniques underlying molecular forensics were developed initial- 
ly for human applications and only later modified and adapted for application to 
wildlife issues. Thus, this treatment will begin with a discussion of DNA-based 
forensics as applied to people. 
History of laboratory approaches


The earliest efforts in human molecular forensics involved typing various 
blood groups and serum proteins, but these markers registered only modest
genetic variation and hence offered limited evidence on individual
identity and uniqueness. At the DNA level, the first widely employed 
approach entailed RFLP analyses of minisatellite sequences, typically 
probed one locus at a time. Especially in the late 1980s and early 1990s, sev- 
eral private companies (e.g., Cellmark and Lifecodes) and governmental 
agencies (e.g., the U.S. Federal Bureau of Investigation) began routinely
typing genomes using minisatellite loci. Several such loci in human popu- 
lations proved to display numerous alleles distinguished from one another 
(in gel mobility) by virtue of variable numbers of tandem repeat sequences 
(VNTRs) (Figure 5.1). 

However, DNA fragments at these minisatellite loci were typically sev- 
eral kilobases in size and were measured with some error, so determining the 
number of distinct allelic classes from a quasi-continuous distribution of 
fragment lengths was problematic (Devlin et al. 1991, 1992). In practice, 
grouping procedures were employed to pool fragments of similar length into 
allelic "bins," whose widths reflected magnitudes of experimental error 
across replicates (Budowle et al. 1991). Even so, each VNTR locus often 
exhibited 10-30 or more differentiable bins of alleles in a typical population 
sample. Most of these alleles were uncommon (Table 5.1), and nearly all
individuals were heterozygous. This extensive variation carried a key
consequence: At any locus, the probability of a genetic match between randomly
chosen individuals was low. As illustrated in Box 5.1, genotypic frequencies
from several unlinked VNTR loci could then be statistically combined to
calculate probabilities of observing any particular multi-locus DNA profile in a
random person drawn from a baseline population for which allele frequencies
were known.


Figure 5.1 Frequency distribution of restriction fragments from the D2S44
minisatellite locus in Caucasian samples (x-axis: DNA fragment size in kb;
y-axis: number of observations). Data are from Lifecodes, Inc. (After Devlin et
al. 1992.)

In the ensuing years, most human forensic laboratories switched to assays
of the short tandem repeat (STR) sequences of microsatellite markers, and this
remains the primary method in use today. Microsatellites are also highly
variable, but they offer several advantages over minisatellite sequences: an
effectively unlimited supply of loci for examination; shorter fragments and
smaller tandem repeat units (2-4 bp each), such that appropriate acrylamide
gels can cleanly distinguish all alleles differing in size (i.e., there is no need for
artificial binning); and a PCR basis for the procedure, meaning that even
minuscule amounts of source tissue (a smidgen of dried blood, a hair, or a
drop of saliva) suffice. Once a database on microsatellite allele frequencies is
available for a reference population, the logic and procedures of data
interpretation in a forensic context are basically identical to those described in
Table 5.1 and Box 5.1 for minisatellite data.


TABLE 5.1 Frequencies of alleles (bins) at four hypervariable VNTR loci in
Caucasians

          VNTR locus D1S7             Binned allele frequencies in
          Frequency in                Caucasian samples at VNTR loci
Binned    Caucasians   Afr. Amer.*      D2S44      D17S79      D4S139
allele    (n = 605)    (n = 372)      (n = 802)   (n = 563)   (n = 460)

   1        0.004        0.007          0.005       0.010       0.004
   2        0.006        0.009          0.003       0.003       0.010
   3        0.009        0.011          0.016       0.007       0.006
   4        0.012        0.007          0.024       0.004       0.014
   5        0.011        0.016          0.046       0.015       0.033
   6        0.014        0.020          0.034       0.223       0.024
   7        0.010        0.011          0.123       0.199       0.040
   8        0.029        0.035          0.106       0.263       0.047
   9        0.021        0.023          0.084       0.200       0.054
  10        0.014        0.030          0.049       0.029       0.071
  11        0.028        0.030          0.083       0.032       0.108
  12        0.031        0.026          0.039       0.010       0.190
  13        0.046        0.044          0.041       0.006       0.129
  14        0.067        0.069          0.039                   0.095
  15        0.057        0.065          0.087                   0.036
  16        0.061        0.073          0.089                   0.036
  17        0.069        0.054          0.075                   0.103
  18        0.055        0.051          0.022
  19        0.060        0.047          0.018
  20        0.063        0.063          0.008
  21        0.079        0.062          0.008
  22        0.077        0.060
  23        0.077        0.074
  24        0.032        0.017
  25        0.019        0.027
  26        0.050        0.071

Source: These data were introduced by the Federal Bureau of Investigation into a
criminal case in Athens, Georgia in May, 1991. They are part of a larger database
that includes frequencies from additional VNTR loci in Caucasians, African
Americans, and Hispanics.

* Shown for comparison are allele frequencies at the D1S7 locus in a sample of
African Americans.


BOX 5.1 Probabilities of Single-Locus and Multi-locus DNA Profiles

Based on the Caucasian VNTR data in Table 5.1, these calculations assume
random associations of alleles both within loci (Hardy-Weinberg equilibrium) and
among loci (gametic phase equilibrium).

(a) Probability that an individual is heterozygous at the D1S7 locus:

    $h = 1 - [(0.004)^2 + (0.006)^2 + \cdots + (0.050)^2] = 0.945$

(b) Examples of probabilities of particular allelic combinations at
    individual loci:

    5/22   Heterozygote at D1S7     $0.011 \times 0.077 \times 2.0 = 0.001694$
    6/8    Heterozygote at D2S44    $0.034 \times 0.106 \times 2.0 = 0.007208$
    5/9    Heterozygote at D17S79   $0.015 \times 0.200 \times 2.0 = 0.006000$
    12/12  Homozygote at D4S139     $0.190 \times 0.190 = 0.036100$

(c) Probability of the multi-locus DNA profile in (b):

    $0.001694 \times 0.007208 \times 0.006000 \times 0.036100 \approx 3 \times 10^{-9}$

Suppose a crime suspect exhibited the multi-locus genotype shown in (b).
Then, if the assumptions of the model are met, the probability of a match with
a randomly drawn genotype from the Caucasian population is about one in
333 million.
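
The arithmetic of Box 5.1 can be sketched in a few lines of Python. This is only an illustration under the same assumptions the box states (Hardy-Weinberg equilibrium within loci and gametic phase equilibrium among loci); the function names are hypothetical, and the allele frequencies are the Caucasian values from Table 5.1.

```python
from math import prod

# Binned allele frequencies at D1S7 in Caucasians (Table 5.1, bins 1-26).
D1S7 = [0.004, 0.006, 0.009, 0.012, 0.011, 0.014, 0.010, 0.029, 0.021,
        0.014, 0.028, 0.031, 0.046, 0.067, 0.057, 0.061, 0.069, 0.055,
        0.060, 0.063, 0.079, 0.077, 0.077, 0.032, 0.019, 0.050]


def heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg: 1 minus the sum of p_i^2."""
    return 1.0 - sum(p * p for p in freqs)


def genotype_frequency(p, q=None):
    """Expected single-locus genotype frequency.

    Homozygote (q omitted): p^2. Heterozygote: 2pq.
    """
    return p * p if q is None else 2.0 * p * q


def profile_probability(single_locus_freqs):
    """Product rule across loci (assumes gametic phase equilibrium)."""
    return prod(single_locus_freqs)


h = heterozygosity(D1S7)
loci = [
    genotype_frequency(0.011, 0.077),  # 5/22 heterozygote at D1S7
    genotype_frequency(0.034, 0.106),  # 6/8 heterozygote at D2S44
    genotype_frequency(0.015, 0.200),  # 5/9 heterozygote at D17S79
    genotype_frequency(0.190),         # 12/12 homozygote at D4S139
]
profile = profile_probability(loci)

print(f"h = {h:.3f}")
print(f"profile probability = {profile:.2e}")
```

The unrounded product is about 2.6 x 10^-9, which the box reports to one significant figure as 3 x 10^-9.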

By the year 1990, more than 2,000 court cases in 49 states had used DNA 
evidence in civil litigation or criminal proceedings (Chakraborty and Kidd 
1991), and more than one legal expert predicted that DNA typing would be 
for the late twentieth century and beyond what traditional fingerprinting 
was to the nineteenth (Melson 1990). That prediction proved to be correct, 
and DNA fingerprinting has today become an integral component of mod- 
ern crime laboratories around the world (Butler 2001). Molecular analyses 
have revolutionized human forensic practice. However, as with conventional
fingerprinting, DNA typing merely provides physical evidence that
assigns particular individuals to, or excludes them from, particular tissue 
samples and, thus, must be used in conjunction with additional lines of evi- 
dence to resolve legal issues such as criminal guilt or innocence. 


History of controversies 


If relevant tissue samples left at a crime scene have yielded assayable DNA, 
two evidential outcomes are then possible: the samples do not match the 
suspect, in which case the forensic evidence may be declared exculpatory; or 
the tissues are declared a match. In the Western judicial tradition, where a 
suspect is considered innocent until proven guilty, the latter situation clear- 
ly focuses attention on the following question: What is the probability of a 
spurious DNA match? At face value, such probabilities calculated from 
DNA forensic data are infinitesimally small ($3 \times 10^{-9}$ in the example in Box
5.1). Thus, a perfect multi-locus match is usually interpreted as establishing 
genetic identity “beyond reasonable doubt.” 

However, this type of conclusion is based on some key assumptions 
whose validity was questioned, thus placing DNA fingerprinting itself on 
trial, not long after the technique’s debut (Lander 1989; Lewontin and Hartl 
1991). Regarding the probability of a spurious genotypic match, one con- 
tentious issue was the premise that genotypic frequencies are independent 
across loci, an assumption that could be violated if (for example) there was 
pronounced nonrandom mating and population subdivision. Human pop- 
ulations are not entirely homogeneous, but rather exhibit genetic structure 
that can produce allelic correlations due to historical or cultural separations. 
(For example, alleles for blond hair and blue eyes, although independent in 
genetic transmission and mode of action, nonetheless are highly correlated 
in human populations due to historical associations and nonrandom mat- 
ing.) Thus, researchers soon analyzed the molecular data sets employed in 
human forensics for possible genetic correlations within or among loci 
(Risch and Devlin 1992; Weir 1992). Overt correlations were not found, at
least for most of the genetic markers analyzed. 

However, the effect of population substructure on forensic conclusions 
is a matter of degree (Nichols and Balding 1991). Consider an extreme exam- 
ple in which a suspect belongs to a small inbred community that differs dra- 
matically in allele frequency from North American Caucasians overall. Use 
of a Caucasian database (as in Table 5.1) as a reference for calculating geno- 
typic probabilities clearly would be inappropriate, and the direction of error 
could work against an innocent defendant (e.g., the likelihood of a geno- 
typic match between an innocent suspect and another member of the local 
community who may actually have committed the crime is much higher 
than the probability of a match within a broader Caucasian population). To 
circumvent this problem, each relevant human "subgroup" might be specified
separately and appropriate probabilities of a genetic match calculated
accordingly for each specific case. Unfortunately, such extensive genetic
characterization is infeasible logistically, even if appropriate subgroups
could somehow be identified. Such concerns led Lewontin and Hartl (1991) 
to conclude that then-current applications of DNA fingerprinting had seri- 
ous flaws as forensic evidence. 

The degree of human population substructure should not be overstated, 
however. In terms of allozyme and blood group polymorphisms, Lewontin 
(1972) had earlier argued that more than 90% of total genetic diversity in 
humans occurred within (rather than between) races and concluded that
“our perception of relatively large differences between human races and 
subgroups... is indeed a biased perception, and that, based on randomly 
chosen genetic differences, human races and populations are remarkably 
similar to each other." Much the same conclusion applies to at least some 
DNA fingerprint loci as well (Balazs et al. 1989, 1992; Krane et al. 1992), as 
illustrated by the similar spectra of allele frequencies at D1S7 in Caucasian
and African American populations (see Table 5.1). This and other observa- 
tions led Morton (1992) to conclude that Lewontin and Hartl's (1991) objec- 
tions to genotypic probability calculations were themselves "absurdly" con- 
servative in favor of the defense. 

The scientific brouhaha over DNA fingerprinting in legal forensics led 
to two major reports by the National Research Council (1992, 1996). In these 
reports, it was pointed out that a variety of conservative calculation proce- 
dures could be followed to diminish any bias against the defense. These 
procedures included the use of wider bins for grouping minisatellite frag- 
ments and the employment of observed rather than expected frequencies of 
single-locus genotypes in the reference population (to circumvent the 
assumption of Hardy-Weinberg equilibrium). The 1992 NRC report pro- 
posed another solution that would be highly conservative in favor of the 
defense: the "ceiling principle." Under this suggestion, allele frequencies 
would be estimated at all marker loci in 15-20 human populations repre- 
senting a diversity of ethnic groups. For each allele, its highest frequency in 
any population or 0.05, whichever is higher, would then be employed in an 
estimate of expected genotypic frequencies against which to evaluate a sus- 
pect's genotypic profile. This method overestimates the expected frequency 
of genotypes in the reference database, but the effect is such that any error 
introduced is in the direction of decreasing the chance that an innocent
suspect is falsely convicted.
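
The ceiling principle as described above can be illustrated with a short Python sketch. The function names are hypothetical, and the example uses only the two D1S7 samples in Table 5.1 (Caucasians and African Americans), whereas the proposal called for 15-20 populations; it is meant to show the direction of the effect, not to reproduce any laboratory's actual procedure.

```python
def ceiling_frequency(freqs_by_population, floor=0.05):
    """Ceiling value for one allele: its highest frequency in any surveyed
    population, or `floor` (0.05 in the text), whichever is higher."""
    return max(max(freqs_by_population), floor)


def genotype_frequency(p, q=None):
    """Expected genotype frequency: p^2 for a homozygote, 2pq otherwise."""
    return p * p if q is None else 2.0 * p * q


# D1S7 alleles 5 and 22 in Caucasians and African Americans (Table 5.1).
allele_5 = [0.011, 0.016]
allele_22 = [0.077, 0.060]

p = ceiling_frequency(allele_5)    # raised to the 0.05 floor
q = ceiling_frequency(allele_22)   # highest observed value, 0.077

ceiling_estimate = genotype_frequency(p, q)
single_db_estimate = genotype_frequency(0.011, 0.077)  # Caucasian database only

print(f"ceiling estimate:         {ceiling_estimate:.6f}")
print(f"single-database estimate: {single_db_estimate:.6f}")
```

Because every allele frequency is rounded upward, the ceiling estimate of the genotype frequency is never lower than the single-database estimate, which is why any error it introduces favors the defendant.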

Apart from these and other statistical issues, the NRC reports also 
addressed a variety of technical matters, such as proper handling and pro- 
cessing of samples at all steps in the investigation, standardization and val- 
idation of molecular procedures and data, and accreditation and monitoring 
of forensic laboratories. Fortunately, all of these concerns have become less 
worrisome over the years as laboratory techniques and background databases
have improved and expanded and tighter standards for quality control at all
levels have been widely adopted. Furthermore, the genetic variation now
available for typing at microsatellite loci is sufficiently high to
have muted former criticisms about probability estimates of genotypic 
matches. In the late 1990s in the United States, a national database known as 
CODIS (Combined DNA Index System) was implemented. It involves 13 
core STR loci that are now employed widely in forensic analysis and which 
collectively yield an average random match probability of less than one in a
trillion among unrelated individuals (Chakraborty et al. 1999). Mito- 
chondrial and Y-linked markers are also employed routinely when informa- 
tion is required about particular human matrilines or patrilines. Today, 
DNA typing via standardized batteries of molecular markers is routine 
practice in human forensics, and scientific controversies about the eviden- 
tiary power of DNA have mostly faded to a distant memory (Lander and 
Budowle 1994). 


Empirical examples 


In the United States, one of the first legal cases (Pennsylvania v. Pestinikas) to 
admit DNA as evidence came in 1986, and it involved use of PCR-based 
assays to analyze tissue samples from an exhumed corpse (Moody 1989). The 
first criminal conviction in the United States based in part on DNA evidence 
came in a 1987 rape trial: State v. Andrews, Orange County, Florida. This case 
established a legal precedent for the use of DNA typing to link a suspect to 
biological material (blood, semen, or hair follicles) left at a crime scene (Kirby 
1990; Roberts 1991). One of the earliest examples of DNA typing in a homi- 
cide case was also unusually bizarre: It involved a mortuary worker accused 
of killing and incinerating his estranged wife at a crematorium in Kansas (see 
Kirby 1990). Circumstantial evidence had implicated the worker in his wife's 
death, but he staunchly maintained that she had not been at the mortuary 
near the time of her disappearance. However, bloodstains discovered on the 
side of the crematorium proved by DNA typing to match other remaining tis- 
sue from the deceased woman. The mortuary worker was convicted of 
aggravated kidnapping and first-degree homicide. In another early example 
of the power of DNA typing methods, Hagelberg et al. (1991) used PCR to 
amplify DNA sequences from the 8-year-old skeletal remains of a murder 
victim. By comparing microsatellite DNA markers in the remains with those 
of the presumptive parents, the victim’s identity was established. 

Not all forensic applications of DNA typing involve crimes this 
macabre, but molecular genetic methods have provided powerful physical 
evidence in thousands of homicides, rapes, burglaries, assaults, hit-and-run 
accidents, missing persons, identifications of war-atrocity victims, and other 
cases. Forensic DNA methods, validation studies, and empirical examples 
are the focus of no less than three major scientific journals: Forensic Science 
International (Elsevier Science), International Journal of Legal Medicine 
(Springer-Verlag), and Journal of Forensic Sciences (American Society for 
Testing and Materials). Several interesting high-profile cases in human DNA 
forensics are summarized in Table 5.2. 
TABLE 5.2 Uses of DNA typing in high-profile human forensic investigations

General description and results of investigation, followed by reference


O. J. Simpson "trial of the century." A famous football player was acquitted
of the 1994 murder of his wife, despite DNA fingerprint evidence that by itself
yielded an unequivocal match of the defendant's blood to the crime scene.
Reference: Levy 1996

Clinton-Lewinsky affair. A semen stain on a blue dress matched a U.S.
president's genotype, thereby contradicting claims of no concrete evidence for
a sexual liaison; this evidence played a role in impeachment proceedings.
Reference: Grunwald and Adler 1999

The Russian Czar. Skeletal remains presumed to be those of Nicholas II, the
Russian Czar killed in the Bolshevik revolution of 1918, were exhumed,
genotyped, and compared with surviving members of the family; results
confirmed the identity of the bones.
Reference: Gill et al. 1994

Arlington Cemetery, Tomb of the Unknown Soldier. By comparing the genotype
of skeletal remains in this famous monument to members of candidate surviving
families, the identity of a "Vietnam Unknown" in the Tomb was confirmed.
Reference: Holland and Parsons 1999

Armed Forces forensics. In 1992, personnel entering the U.S. Army through six
basic training sites donated samples for DNA typing. This program soon
expanded dramatically. Several million such specimens have been collected;
they have been used to identify remains, for example, in the Gulf Wars and in
terrorist bombings of military personnel.
Reference: www.afip.org/Departments/oafme/dna/

Branch Davidian fire. In the first mass-disaster investigation involving DNA
evidence, charred remains were used to genotype and thereby identify victims
of a federal assault on the Branch Davidian compound in Waco, Texas.
Reference: Clayton et al. 1995

SwissAir Flight 111. DNA typing helped to identify all 229 people who died in
the 1998 crash of this trans-Atlantic flight.
Reference: Butler 2001

Jefferson-Hemings affair. Y-chromosome markers established that Thomas
Jefferson sired children by one of his slaves, Sally Hemings. The approach
involved tracing paternal lineages back from living descendants of the
Jefferson and Hemings families.
Reference: Foster et al. 1998

Human rights abuses. Nuclear and mtDNA markers have solved missing-persons
and other cases of human rights abuses in politically repressive regimes and in
war-torn areas throughout the world, from Argentina to the Balkans.
Reference: Owens et al. 2002


Note: For more details, see the references cited and Butler 2001.
Ramets and Genets


Background


Many species of invertebrate animals and plants exhibit facultative asexu- 
al (clonal) as well as sexual reproduction (Jackson et al. 1985). For example, 
each colony of staghorn coral (Acropora cervicornis) consists of numerous 
asexually derived polyps that are genetically identical to one another and 
to the sexually produced planula larvae from which they arose. These 
polyps are housed jointly in a secreted calcareous skeleton that breaks occa- 
sionally, producing physically disjunct "daughter" colonies whose mem- 
bers are also genetically identical. Some coral species also produce disper- 
sive asexual larvae (Stoddart 1983a). In various other invertebrates, mech- 
anisms of asexual proliferation may include clonal production of larva-like 
propagules, somatic fragmentation, polyembryony (production of multiple
individuals by division of an early embryo or zygote), or parthenogenesis 
(Blackwelder and Shepherd 1981; Eaves and Palmer 2003; Jackson 1986). 
Similarly, in plants, clonal proliferation can involve runners, stolons, rhi- 
zomes, bulbs, root or stem suckers, plant fragments, or even highly disper- 
sive asexual (apomictic) seeds, which can arise when a non-meiotic cell in 
the ovarian wall initiates seed formation or when failure of a reduction 
division in the germ line produces eggs with a full complement of chromo- 
somes from the maternal parent (Cook 1980; Vielle-Calzada et al. 1996). 
One example of asexual reproduction involves quaking aspen trees 
(Populus tremuloides), which produce sexual seeds, but also can proliferate 
vegetatively via buds that sprout from roots of mature specimens. Death of 
the mother stem may then result in the physical disconnection of clone- 
mates. As phrased by Harper (1985), “It is the nature of many plant and 
animal growth forms that the organism dies in bits and continues growth 
as separated parts." 

In species with such mechanisms of clonal reproduction, challenging 
questions arise, such as, What constitutes an individual? What are the units 
of selection? (Buss 1983, 1985). Harper (1977) defined the genetic individual, 
or "genet," to include all entities (however physically organized) that have 
descended from a single sexually produced zygote and, hence, are geneti- 
cally identical to one another (barring mutation). By contrast, a "ramet" is 
an individual in a physical or functional sense—a physiologically or mor- 
phologically coherent module having arisen through clonal replication. 
Thus, a genet may consist of many modular ramets, asexually derived. 
Many evolutionary interpretations of field data hinge critically on the cor- 
rect distinction of clonemates from non-clonemates. For example, secure 
genetic knowledge of which ramets ultimately derive from the same zygote 
is necessary for drawing proper inferences about sex ratios within 
sexual-asexual populations, magnitudes and patterns of effective gene flow, 
degrees of outcrossing and the mating system, extents of interclonal compe- 
tition, and the evolutionary ages of clones (Cook 1983, 1985). 
In many cases in which a species' mode of reproduction is unknown,
but asexuality is suspected, molecular genetic markers can help to settle the 
issue. The evidence often involves family data. For example, by showing 
that assayed siblings were genetically identical to their parent plant at sev- 
eral polymorphic allozyme loci, Roy and Rieseberg (1989) confirmed the 
occurrence of apomixis (reproduction without fertilization) in the mustard 
plant Arabis holboellii, a species that had been suspected of clonality from 
other evidence (e.g., occurrence of pollen with unreduced chromosome 
number). Conversely, some species suspected of clonal reproduction prove 
upon molecular examination to be capable of outcrossing, as was demon- 
strated by the observation of recombinant RAPD genotypes in a benthic 
freshwater bryozoan, Cristatella mucedo (Jones et al. 1994). In other cases,
molecular evidence for clonal reproduction may involve population genetic 
data. For example, Nybom and Schaal (1990) used DNA fingerprinting 
assays to show that a population of blackberry (Rubus pensylvanicus)—a 
species suspected of frequent asexual reproduction—had many fewer 
recombinant genotypes than a congener with predominant sexual repro- 
duction (the black raspberry, R. occidentalis). Similarly, Graham et al. (1997) 
used RAPD markers to show that the primary mode of reproduction in the 
red raspberry (R. idaeus) is sexual. 

A wide variety of species have been subject to such molecular 
appraisals of clonality. Many marine benthic algae release spores into the 
water column, but whether these are sexual or asexual propagules remains 
uncertain. For one such species (Enteromorpha linza), Innes and Yarish 
(1984) documented from allozyme markers that the spores are clonal. 
Conversely, sexual reproduction was documented by allozyme markers in 
both a free-living amoeba, Naegleria lovaniensis (Pernin et al. 1992), and a 
fungal pathogen, Crumenulopsis sororia (Ennos and Swales 1987). Many 
marine invertebrates brood their young, and it is of interest to know 
whether these larvae are the products of sexual or asexual reproduction. 
Using allozyme assays, Black and Johnson (1979) showed that brooded 
young of the intertidal anemone Actinia tenebrosa were genetically identical 
to their parents, indicating asexual reproduction. Similarly, Ayre and 
Resing (1986) documented asexual reproduction for two coral species 
(Tubastraea diaphana and T. coccinea). On the other hand, in two other coral 
species assayed for allozymes (Acropora palifera and Seriatopora hystrix), 
nonparental genotypes were detected in a majority of larval broods, thus 
indicating sexual recombination. 
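
The inference used in these brood studies reduces to comparing each offspring's multi-locus genotype with that of its parent. The sketch below is a hypothetical illustration, not the procedure of any cited study: the function names, locus labels, and genotypes are invented, and genotypes are represented as a mapping from locus name to a pair of alleles.

```python
def normalize(genotype):
    """Sort the two alleles at each locus so ('F', 'S') equals ('S', 'F')."""
    return {locus: tuple(sorted(alleles)) for locus, alleles in genotype.items()}


def nonparental_offspring(parent, brood):
    """Return indices of brood members whose multi-locus genotype differs
    from the parent's at one or more assayed loci."""
    reference = normalize(parent)
    return [i for i, g in enumerate(brood) if normalize(g) != reference]


# Invented allozyme genotypes at two hypothetical loci.
parent = {"LocusA": ("F", "S"), "LocusB": ("F", "F")}

uniform_brood = [{"LocusA": ("S", "F"), "LocusB": ("F", "F")}] * 3
mixed_brood = [
    {"LocusA": ("F", "S"), "LocusB": ("F", "F")},
    {"LocusA": ("S", "S"), "LocusB": ("F", "F")},  # nonparental genotype
]

print(nonparental_offspring(parent, uniform_brood))
print(nonparental_offspring(parent, mixed_brood))
```

A brood with no nonparental genotypes is consistent with asexual reproduction, but it does not prove it when few loci are assayed, because sexually produced offspring can match a parent by chance. A single nonparental genotype, by contrast, is direct evidence of segregation or recombination.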

For many animal populations, the first suggestion of parthenogenesis, 
wherein progeny develop directly from an unfertilized and unreduced 
female gamete (Suomalainen et al. 1976), comes from the indirect evidence
of a strongly female-biased sex ratio. Clonal reproduction then may be con- 
firmed with genetic markers: True ameiotic parthenogens derived from a 
single female are genetically uniform, barring post-formational mutations 
(Hebert and Ward 1972), and clonal parthenogenetic populations that arose 
through recent hybridization exhibit “fixed heterozygosity” at loci
distinguishing the parental species (Dessauer and Cole 1986). For example,
Echelle and Mosier (1981) used allozyme evidence to confirm that a popu- 
lation of silverside fishes in the Menidia clarkhubbsi complex reproduces by 
clonal means, as did Dawley (1992) for two killifish populations that proved 
to have arisen through crosses between the sexual species Fundulus hetero- 
clitus and F. diaphanus. In some populations of the parthenogenetic aphids 
Myzus persicae and Sitobion avenae, DNA fingerprinting assays revealed char- 
acteristic genetic signatures that confirmed suspected modes of clonal 
reproduction (Carvalho et al. 1991; Simon et al. 1999). As discussed below, 
molecular markers have likewise substantiated clonal reproduction in many 
other vertebrate and invertebrate animals, as well as in microbes. 

The genetic hallmark of clonal reproduction is the stable transmission of 
genotypes across generations, without the shuffling effects of genetic recom- 
bination (the only source of variation therefore being mutation). Thus, the 
phrase “clonal reproduction” is sometimes also used to describe genetic 
transmission in self-fertilizing hermaphrodites, in which the intense 
inbreeding that characterizes this reproductive mode can result over the 
generations in near homozygosity at most loci. Although selfing organisms 
may retain meiosis and syngamy (union of gametes, in this case from a sin- 
gle parent), genetic segregation and recombination in effect are suppressed 
once homozygosity through inbreeding is achieved. 

This latter phenomenon is well illustrated by a cyprinodontid fish, 
Rivulus marmoratus—the only known vertebrate hermaphrodite with regu- 
lar self-fertilization (Harrington 1961). This species exists in nature as high- 
ly homozygous “clones,” as was initially revealed by intraclonal fin graft 
acceptances (indicating near-identity at histocompatibility loci; Harrington 
and Kallman 1968; Kallman and Harrington 1964) and complete homozy- 
gosity at more than 30 allozyme loci (Vrijenhoek 1985). Subsequent DNA- 
level studies based on minisatellite loci (Turner et al. 1990, 1992) revealed 
the presence of additional clonal genotypes, most of which probably arose 
via recombination, suggesting that outcrossing had also taken place, albeit 
infrequently at most sites (Laughlin et al. 1995; Lubinski et al. 1995). More 
variation was then uncovered in molecular assays of MHC loci, but the het- 
erozygosity in this case was probably due to long-term retention of alleles 
in different strains that had arisen from ancestral outcrossing forms (Sato et 
al. 2002). 

These kinds of genetic observations highlight a cautionary note that 
applies to studies of "clonal diversity" in strongly inbreeding species (as 
well as in truly asexual taxa): The absolute number of recognized clones can 
depend on the discriminatory power of the molecular (or other) assay used 
as well as the reproductive biology and evolutionary history of a species. 
Thus, it is not enough to be concerned with mere tallies of identifiable geno- 
types. Of much greater importance are establishing historical relationships 
among genotypes and understanding the biological processes (including 
selection as well as reproductive mode) that may have forged and main- 
tained whatever genetic variation is observed. 
Spatial Distributions of Clones


The geographic distribution of genotypes in a facultatively asexual species 
is influenced by the relative frequencies of sexual versus clonal proliferation 
and by the dispersive characteristics of propagules produced under each 
reproductive mode. Early attempts to illuminate these population genetic 
structures utilized indirect phenotypic criteria for clonal identification, such 
as distributions of morphological attributes or phenologies in plants (Barnes 
1966), agonistic behaviors in sea anemones (Sebens 1984), and histocompat- 
ibility responses (acceptance versus rejection of tissue grafts) in marine 
invertebrates and unisexual vertebrates (Cuellar 1984; Neigel and Avise 
1983a; Schultz 1969). Subsequent attempts to assign ramets to genets and 
map the spatial distributions of clones often involved direct molecular 
genetic assays (Ellstrand and Roose 1987; Silander 1985).


PLANTS AND ALGAE. Maddox et al. (1989) used allozyme genotypes at four 
polymorphic loci to map the microspatial distributions of goldenrod 
(Solidago altissima) clones in fields of various ages. In this case, different 
allozyme genotypes almost certainly reflected sexual recruitment (via seeds) 
into the population, whereas multiple ramets of the same genotype regis- 
tered asexual proliferation via rhizomes. Patterns of dispersion of goldenrod 
genotypes differed among fields: Clones were localized in the youngest 
plots (Figure 5.2), whereas older fields exhibited greater spatial intermixture 
of clones and fewer remaining rhizome connections among ramets. 
Apparently, colonization of a field by sexually produced seeds is followed 
by ramet proliferation and eventual spatial mixing of clones over microgeo- 
graphic scales. In other such allozyme assessments of local clonal reproduc- 
tion, Burke et al. (2000) discovered that approximately 50% of genets in two 
populations of Iris hybrids consisted of multiple ramets originating from 
vegetative proliferation. Murawski and Hamrick (1990) found that clonal 
growth in the bromeliad Aechmea magdalenae had resulted in the spread of 
ramets over distances of several meters. In the columnar cactus Lophocereus 
schottii, Parker and Hamrick (1992) found that most clonemates were tight- 
ly aggregated, but also that a few individuals were separated from their 
clonemates by more than 70 m (probably as a result of detached stem pieces 
washing downstream during floods). Using AFLP markers, Suyama et al. 
(2000) identified a large clone of bamboo (Sasa senanensis) in Japan that 
occupied an area about 300 meters in diameter. 
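
The basic bookkeeping behind such surveys, assigning ramets to putative genets by multilocus genotype, can be sketched in a few lines. The sample identifiers and genotypes below are hypothetical, and the function name is illustrative rather than drawn from any cited study. Ramets with identical genotypes at every scored locus are simply pooled, so the result inherits whatever limits of resolution the markers have.

```python
from collections import defaultdict


def assign_ramets_to_genets(ramets):
    """Group ramets that share an identical multilocus genotype.

    ramets maps a ramet identifier to a tuple of per-locus genotypes,
    each genotype being a pair of allele labels.
    """
    genets = defaultdict(list)
    for ramet_id, genotype in ramets.items():
        # Sort alleles within each locus so ("F", "S") equals ("S", "F").
        key = tuple(tuple(sorted(locus)) for locus in genotype)
        genets[key].append(ramet_id)
    return dict(genets)


# Hypothetical two-locus allozyme genotypes (F = fast, S = slow allele).
samples = {
    "r1": (("F", "S"), ("F", "F")),
    "r2": (("S", "F"), ("F", "F")),
    "r3": (("S", "S"), ("F", "S")),
    "r4": (("F", "S"), ("F", "F")),
    "r5": (("F", "F"), ("S", "S")),
}

genets = assign_ramets_to_genets(samples)
n_genets = len(genets)
multi_ramet = [ids for ids in genets.values() if len(ids) > 1]

print(n_genets)     # 3
print(multi_ramet)  # [['r1', 'r2', 'r4']]
```

Mapping the coordinates of the ramets in each group would then give a spatial picture of the kind shown in Figure 5.2.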

Molecular markers have also been used to study clonal distributions in 
various bush and tree species. Torimaru et al. (2003) found from RAPD 
analyses that patches of holly (Ilex leucoclada) were composed partly of dif- 
ferent genets and partly of multiple ramets (stems in this case) within a given 
clone. Applying allozyme approaches to arctic dwarf birch (Betula glandu- 
losa), Hermanutz et al. (1989) identified single clones encompassing areas of 
at least 50 m². Although normally a sexual species, dwarf birch at the study
site had apparently reproduced by "vegetative layering," wherein prostrate 
Figure 5.2 Microspatial map of allozyme-identified clones in Solidago altissima.
Each circle denotes a current living ramet; total map area 0.75 m². Different
numbers indicate distinct electrophoretic genotypes, and shaded areas encompass
multiple nearby ramets apparently belonging to each genet. (After Maddox et al. 1989.)


branches beneath the moss layer produced new ramets vegetatively. In a 
local population of scrub oaks (Quercus geminata) on Merritt Island, Florida, 
Ainsworth et al. (2003) used microsatellite markers to identify and map 
clones that arose from vegetative proliferation via suckers. 

A form of apomictic reproduction in plants and algae that could result 
in unusually widespread dispersion of clones is agamospermy, the forma- 
tion of unreduced spores, seeds, or embryos by asexual processes (Bayer 
1989, Hughes and Richards 1989). For example, in an agamospermous 
marine alga (Enteromorpha linza) that can produce water-dispersed asexual 
spores, particular allozyme-identified clones were distributed over the 
entire survey transect of more than 150 shoreline kilometers (Innes 1987). 
Within each of two obligate agamospermous populations of dandelions 
(Taraxacum sp.), all individuals proved to be genetically identical at 15 
allozyme loci, whereas related sexual populations were highly diverse 
genetically (Hughes and Richards 1988). Based on additional molecular evi- 
dence, other apomictic dandelion populations showed considerable genetic 
variation, perhaps evidencing the coexistence of multiple clones (Ford 1985; 
Ford and Richards 1985). In general, however, a serious complication in
interpreting clonal diversity in ancient and widespread "agamospecies" is 
in distinguishing sexually derived genetic variation from that which may 
have arisen via post-formational mutations (Brookfield 1992; see below). In 
a molecular study of Taraxacum dandelions in Norway, Mes et al. (2002)
attributed much of the genetic variation observed in STRs and AFLPs to
mutation accumulation within an ancient clone. On the other
hand, using rDNA and cpDNA analyses of triploid dandelions in North 
America, King (1993) uncovered high genetic variation, much of which she 
attributed to multiple hybridization events that produced these apomicts. 
According to Cook (1980), the record holder for size and age of a plant 
clone may be the quaking aspen. Based on a distinctive morphological 
appearance and spatial arrangement, one suspected genet was represented by 
more than 47,000 ramets (covering 107 acres) that might have traced to a sin- 
gle seed perhaps deposited several thousand years ago at the close of the 
Wisconsin glaciation (Kemperman and Barnes 1976). Looks might be deceiv- 
ing, however. In allozyme surveys of other quaking aspen populations, 
Cheliak and Patel (1984) found that several “clones” provisionally identified 
by morphology actually were composed of several distinct electrophoretic 
genotypes that probably had arisen through recombination (and hence sexu- 
al reproduction). They concluded that environmental influences on pheno- 
type invalidate morphological appraisals of aspens as a reliable guide to clone 
identification. In contrast, DNA-level assays can provide definitive informa- 
tion on clonal identities and distributions in this species (Rogstad et al. 1991). 


FUNGI. Molecular documentation of clonal identity is also available for the 
honey mushroom, Armillaria bulbosa, in which one gigantic clone identified 
by mtDNA and nuclear RAPD markers was claimed at the time to be one of 
the Earth's largest and oldest individuals of any species (M. L. Smith et al. 
1990, 1992). This pathogenic fungus of tree roots lives in mixed hardwood 
stands, where it can spread vegetatively by cordlike aggregations of hyphae 
that weave across the forest floor. The molecular markers revealed that one 
presumably interconnected clone of A. bulbosa in northern Michigan had 
spread across 37 acres, weighed in aggregate more than 90,000 kg (about the 
size of an adult blue whale), and was perhaps 1,500 years old. A smaller clone 
nearby covered a mere 5 acres. No doubt even larger and older clones in this 
or other fungal species remain to be discovered. 


INVERTEBRATE ANIMALS. Most sea stars (Asteroidea, Echinodermata) can 
reproduce asexually by fission, whereby detached arms regenerate new bod- 
ies. Johnson and Threlfall (1987) used allozyme markers to estimate the occur- 
rence of fission versus sexual reproduction in the sea star Coscinasterias cala- 
maria in Western Australia. On local scales, clonal reproduction proved to pre- 
dominate, such that many individuals within 50 m of one another were clone- 
mates, but sexual recruitment was important as well, as evidenced by the fact 
that distinct genotypes were present in different parts of the study site. 

In the soft coral Alcyonium rudyi, members of allozyme-documented
clones produced by binary fission were typically grouped within 50 cm of
one another on the same rock surface (McFadden 1997). Hard coral colonies
can also proliferate clonally, in this case by fragmentation. An allozyme sur- 
vey of Pavona cactus revealed the significance of this asexual process in dis- 
tributing clonemates over distances of up to nearly 100 meters along reefs in 
eastern Australia (Ayre and Willis 1988). Some corals, such as Pocillopora 
damicornis, also produce dispersive planula larvae by asexual means, which 
presumably accounts for an observation by Stoddart (1984a,b) that particu- 
lar clones identified by protein electrophoresis were dispersed over dis- 
tances up to several kilometers. 

In a sea anemone (Actinia tenebrosa) that can produce brooded young 
asexually, particular clonal genotypes identified by allozyme assays were 
distributed over hundreds of meters of shoreline in Australia (Ayre 1984). 
In lagoons surrounding the United Kingdom, RAPD assays revealed that 
about 60% of surveyed individuals of the anemone Nematostella vectensis 
were genetically identical, probably as a result of clonal proliferation in 
conjunction with historical population bottlenecks (Pearson et al. 2002). On 
the other hand, allozyme surveys of the intertidal sea anemone Oulactis 
muscosa (a species that can reproduce by fission) revealed a population 
genetic structure consistent with recruitment almost exclusively by sexual 
reproduction (Hunt and Ayre 1989). In yet another sea anemone species 
similarly surveyed (Metridium senile), various populations in northeastern 
North America evidenced large differences among sites in the frequencies 
with which sexual versus clonal recruitment had taken place (Hoffman 
1986). 

In several marine invertebrate groups, clones were traditionally distin- 
guished using various nonmolecular sources of information, such as aggres- 
sive behaviors among sea anemones or morphotypic appearances of corals 
and sponges (Ayre 1982; Ayre and Willis 1988; Solé-Cava and Thorpe 1986). 
Another source of data on putative clonal identities was histocompatibility 
responses. Within many coral and sponge species, for example, artificial 
grafts between colony branches exhibit either an acceptance or rejection 
reaction, and indirect evidence suggested that these two responses signal 
clonal identity and non-identity, respectively (Hildemann et al. 1977; Neigel 
and Avise 1983b; review in Avise and Neigel 1984). However, this possibili- 
ty was not always fully corroborated in more direct genetic analyses (Curtis 
et al. 1982, Neigel and Avise 1985; Resing and Ayre 1985; see also Hunter 
1985). Instead, reports appeared of occasional graft rejections between 
colonies that were identical in multi-locus allozyme genotype and of graft 
acceptances between some colonies that differed in allozyme genotype 
(Table 5.3). The former observation may simply reflect the likelihood that 
small numbers of surveyed allozyme loci failed to distinguish all clones. The 
latter observation may reflect imperfect discriminatory power at
histocompatibility genes themselves. Like the histocompatibility systems of
vertebrates (Parham and Ohta 1996) and the self-incompatibility systems of many
TABLE 5.3 Levels of agreement between histocompatibility response and
allozyme genotype as possible indicators of clones in local
populations of marine invertebrates

                         Tissue graft response
Allozyme genotype        Accept        Reject
Identical
   Niphates                23             5
   Montipora               26            12
Non-identical
   Niphates                 5            32
   Montipora               21            35


Note: Results shown are for a sponge, Niphates erecta, assayed in 65 pairs of colonies at three 
polymorphic loci (data from Neigel and Avise 1985); and the corals Montipora dilatata and M. 
verrucosa, assayed in a total of 94 pairs of colonies at one polymorphic allozyme locus (data 
from Heyward and Stoddart 1985). For both Niphates and Montipora, associations between 
histocompatibility response and allozyme genotype were statistically highly significant, but 
nonetheless imperfect. 


outcrossing plants (Charlesworth 1995; Richman and Kohn 1996), tissue 
recognition loci in several marine invertebrates are known to be highly poly- 
morphic (Grosberg 1988; Grosberg et al. 1996, 1997), yet they too can fall 
somewhat short of perfection in distinguishing genetic self from nonself (see 
the section on genetic chimeras below). 
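
The strength of the associations summarized in Table 5.3 can be checked with an ordinary chi-square test on each 2 x 2 table. The sketch below is an illustration only: it uses the Pearson statistic without continuity correction, which may differ from the tests applied in the original papers, and it treats pairs of colonies as independent observations, which is an assumption.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table

                 accept  reject
    identical      a       b
    non-identical  c       d
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Counts taken from Table 5.3.
niphates = chi_square_2x2(23, 5, 5, 32)
montipora = chi_square_2x2(26, 12, 21, 35)

# Standard critical value of chi-square for 1 df at P = 0.01.
CRITICAL_1DF_P01 = 6.635

print(round(niphates, 1), niphates > CRITICAL_1DF_P01)    # 30.6 True
print(round(montipora, 1), montipora > CRITICAL_1DF_P01)  # 8.7 True
```

Both statistics exceed the 1% critical value, in keeping with the table note. The weaker association in Montipora fits the point made above: a single polymorphic locus leaves many distinct clones indistinguishable.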

Various DNA fingerprinting methods are especially powerful in studies 
of clonal population structure. In an early example, Coffroth et al. (1992) uti- 
lized minisatellite probes to study DNA fingerprints and clonal structure in 
a gorgonian coral (Plexaura sp.) that reproduces by fragmentation as well as 
by sexual production of dispersive larvae. Among 73 scrutinized colonies on 
seven reefs in Panama, 29 different genotypes were identified in these 
molecular assays. Identical DNA fingerprints were observed only in adja- 
cent colonies on a reef. In some cases, both DNA fingerprinting and histo- 
compatibility assays were conducted, and these approaches revealed simi- 
lar numbers of putative clones: 17 and 13, respectively. Also worthy of men- 
tion were experimental controls demonstrating that multiple samples from 
a single colony produced identical DNA fragment profiles, and that symbi- 
otic zooxanthellae (algae) living within the corals’ tissues were not the 
source of the DNA gel bands scored. A follow-up DNA fingerprinting study 
by Coffroth and Lasker (1998) provided spatial maps of particular Plexaura 
clones on a Panama reef (Figure 5.3). 

Many invertebrates can proliferate clonally by parthenogenesis. For 
example, 17 of 33 North American species in the earthworm family 
Lumbricidae exhibit a parthenogenetic reproductive mode that apparently 
evolved from an ancestral hermaphroditic condition (Jaenike and Selander 
1979). In one such species (Octolasion tyrtaeum), Jaenike et al. (1980) used 
allozyme markers to identify eight distinct clones, two of which were wide- 
Figure 5.3 Underwater mesospatial map of Plexaura kuna clones on a Panama
reef, as identified by DNA fingerprints. Shaded areas denote reef structure; light
areas are sand channels. Each black dot is a living ramet, and continuous lines
encircle ramets that belong to a given genet. (After Coffroth and Lasker 1998.)


spread and common in diverse soil types over several thousand square 
kilometers surveyed in the eastern United States. Thus, some asexual line- 
ages apparently can occupy broad niches and achieve great success, at least 
over short ecological time. Many freshwater gastropods (snails) likewise 
reproduce parthenogenetically (Jarne and Delay 1991). Allozyme studies of 
the polyploid parthenogen Thiara balonnensis in Australia revealed that each 
local population generally consisted of only one clone, with genetic distance 
among clones correlated with geographic distance (Stoddart 1983b). This 
pattern of variation was postulated to result from the gradual evolution of 
new clones by mutational processes (as opposed to occasional sexual repro- 
duction) in conjunction with geographic separations. 

A converse example of high clonal diversity over microgeographic scales
was encountered in early studies of another species—the cladoceran Daphnia 
pulex—that exhibits obligate parthenogenesis throughout much of its range 
(Hebert and Crease 1980). Initial surveys revealed a total of 22 allozyme-iden- 
tified clones in eleven populations, with up to seven genotypes coexisting in 
a single lake. Subsequent studies of D. pulex using mtDNA and allozyme 
markers revealed many additional clones and also demonstrated that obligate 
parthenogenesis had a polyphyletic origin from facultative parthenogenesis 
within this species (Crease et al. 1989; Hebert et al. 1989). In a related species, 
D. magna, tens to hundreds of clones sometimes coexist within a pond (Hebert 
1974a,b; Hebert and Ward 1976). Where D. magna occurs in temporary habi- 
tats, it is a cyclic parthenogen, producing drought-resistant sexual eggs 
(requiring fertilization) each year. These zygotes reestablish populations that 
then are maintained by two or three generations of clonal parthenogenesis 
until the pond dries up again. Interestingly, in permanent habitats, D. magna 
reproduces continually by parthenogenesis and tends to exhibit fewer clonal 
types (Hebert 1974c). Several more molecular surveys of Daphnia species have 
appeared in recent years (see Colbourne et al. 1998 and citations therein). 


VERTEBRATE ANIMALS. Many vertebrate species, including humans, also 
produce clonemates occasionally—that is, whenever a pregnancy involves 
identical (monozygotic) twins. Although such instances are usually sporadic, 
in one mammalian taxonomic group—armadillos in the genus Dasypus—this 
reproductive mode is constitutive (Loughry et al. 1998a). In D. novemcinctus, 
for example, each litter typically consists of four pups that are genetically 
identical to one another, but distinct from both parents. This clonal mode of 
reproduction, known as polyembryony, differs from classic asexual repro- 
duction in that clonemates are intra- rather than intergenerational. This 
reproductive mode is also evolutionarily puzzling because it entails the pro- 
duction of "carbon copies" of a new and previously untested genotype (anal- 
ogous to parents buying multiple raffle tickets with the same number). 

One hypothesis to account for polyembryony in armadillos involves the 
notion of nepotism (favoritism toward kin). Perhaps armadillo clonemates 
within a litter help one another build dens, find food, or detect predators. 
Any genes responsible for polyembryony might have been favored across 
the generations if strong littermate cooperation was the norm and led to 
higher mean survival and reproduction. Such close cooperation would 
entail tight spatial associations among clonemates, an idea that was critically
tested using the clone-discriminating power of polymorphic microsatellites.
In a large study population in northern Florida, Prodöhl et al. (1996)
used these molecular markers to map the mesospatial distribution of
armadillo clonal sibships (as well as to assess genetic parentage; Prodöhl et
al. 1998). Armadillo clonemates (especially adults) proved not to be spatially
clustered. This finding, together with direct behavioral evidence
(Loughry et al. 1998a,b), argues against the nepotism hypothesis for this 
remarkable instance of vertebrate polyembryony. 

Ages of clones


A common belief is that populations of clonally reproducing organisms 
must have short evolutionary life spans, due either to “Muller’s ratchet” 
(the accumulation of deleterious mutations and gene combinations that in 
the absence of recombination cannot readily be purged from populations; 
Muller 1964) or to a presumed lack of sufficient recombinant variety to 
allow adaptive responses to environmental challenges (Darlington 1939; 
Felsenstein 1974; Maynard Smith 1978; Williams 1975). However, the evi- 
dence cited above suggests that clonal lineages in some plants and inverte- 
brate animals can achieve fairly wide distributions and enjoy at least mod- 
erate-term ecological success. Nevertheless, virtually all extant asexual taxa 
represent only the outermost twig-tips (rather than deeper branches or 
major limbs) in the Tree of Life, indicating that they are geologically young 
and, in general, that asexual lineages are evolutionarily ephemeral. Among 
metazoans, primary exceptions to this rule might involve bdelloid rotifers 
and darwinulid ostracods, as described next. 


BDELLOID ROTIFERS. In the class Bdelloidea of the phylum Rotifera, all 360 
described species (belonging to 18 genera and four families) appear to lack 
males, hermaphrodites, meiosis, or any of the other trappings typically asso- 
ciated with sexual reproduction and genetic exchange (Butlin 2000; Mayr 
1963). Instead, these tiny freshwater creatures reproduce parthenogenetical- 
ly via direct-developing eggs produced by mitotic cell divisions, without 
reduction in chromosome number and without fertilization. Based on fossil 
evidence including amber-preserved specimens, bdelloid rotifers arose at 
least 35 million years ago. Has their long evolutionary survival and success 
truly been without benefit of sexual reproduction? If so, this would be some- 
what of an “evolutionary scandal” (Maynard Smith 1986) because conven- 
tional wisdom holds that genetic recombination has long-term as well as 
short-term adaptive importance (Butlin 2002). 

By examining nucleotide sequences at each of several nuclear loci in var- 
ious species of bdelloid rotifers and comparing the results with those for sex- 
ual species of rotifers and some other invertebrates, Mark Welch and 
Meselson (2000, 2001) tested the ancient asexuality hypothesis. Long-term 
asexual reproduction in the bdelloids was presumably evidenced by several 
genetic signatures, including: high sequence divergence between allelic pairs 
(the “Meselson effect”) even within single specimens, as expected if non- 
recombining alleles within an asexual diploid lineage have been maintained 
independently for long periods of time; and a relative paucity of transpos- 
able elements (Arkhipova and Meselson 2000), as might also be expected 
because, in theory, selfish transposable elements that proliferate within an 
asexual genome gain none of the personal fitness advantages they typically 
enjoy when housed in sexually reproducing hosts (Hickey 1982). On the 
other hand, in statistical analyses of these same sequence data, Gandolfi et al. 
(2003) advanced what they provisionally interpreted as lasting footprints of 
otherwise cryptic genetic recombination in the past, so perhaps even the
bdelloid rotifers have not been entirely free of sex. 
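
The quantity at the heart of the "Meselson effect" is the sequence divergence between the two allelic copies of a gene within one individual. A minimal sketch of that measurement is given below, using the uncorrected proportion of differing sites (p-distance) on a pair of made-up aligned sequences. The sequences, the function name, and the choice of an uncorrected distance are assumptions for illustration; published analyses generally apply substitution-model corrections.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ("-") or an ambiguous base ("N") in either
    sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical aligned alleles from a single specimen.
allele_1 = "ATGGCCTTAGACGT-ACGTA"
allele_2 = "ATGACCTTGGACGTTACGCA"

print(round(p_distance(allele_1, allele_2), 3))  # 0.158
```

Under the ancient asexuality hypothesis, distances of this kind between alleles within an individual would be expected to be large relative to those in sexual relatives, because the two copies have accumulated mutations independently.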


OSTRACODS. Tiny crustaceans in the family Darwinulidae offer another 
likely example of unisexual complexes with great evolutionary antiquity, 
perhaps as much as 200 million years (Martens et al. 2003). DNA sequence
analyses of several nuclear and mitochondrial regions in the asexual ostra- 
cod Darwinula stevensoni documented that genetic diversity in this taxon 
was quite low (Schön and Martens 2002, 2003), in contrast to the situation in
bdelloid rotifers. At face value, this finding seems to contradict an expecta- 
tion for ancient asexuals—that is, that genetic variation should be high 
between alleles within lineages as well as between separate lineages. Low 
allelic diversity does not necessarily negate the possibility of ancient asexu- 
ality for the darwinulids, however, because several other factors might 
account for this result (Butlin 2000). For example, Schön and Martens (1998)
provisionally favor a hypothesis of ancient asexuality in D. stevensoni 
accompanied by the postulated evolution of highly efficient DNA repair 
mechanisms that might have enabled these organisms to circumvent 
Muller's ratchet. 

In another ostracod family (Cyprididae), the freshwater species 
Cyprinotus incongruens displays a diversity of diploid, triploid, and 
tetraploid parthenogenetic clones; it also has sexual relatives (Turgeon and 
Hebert 1994). Molecular genetic studies have suggested that transitions to 
polyploidy have been common in this taxon and that asexuality in the com- 
plex has persisted for at least several million years (Chaplin and Hebert 
1997). However, these latter authors also emphasize how difficult it is to 
eliminate the possibility of independent and recent transitions to asexuality 
from closely related sexual ancestors, a caveat that applies to many studies 
that have proclaimed discoveries of ancient asexual lineages (Judson and 
Normark 1996; Little and Hebert 1996; see below). 


VERTEBRATE ANIMALS. Apart from polyembryony (discussed above), 
other modes of clonal or quasi-clonal reproduction are known in about 70 
vertebrate species (Dawley and Bogart 1989), for which the term "biotype" 
is often preferred because traditional species concepts hardly apply. These 
biotypes typically consist solely of females that propagate by parthenogen- 
esis or related reproductive modes (Figure 5.4). Essentially all unisexual 
vertebrates arose through hybridization between related sexual species; 
this aspect of their phylogenetic histories will be deferred to Chapter 7. 
Here we consider the evolutionary ages of vertebrate unisexual lineages as 
inferred from molecular data (primarily from mtDNA). 

Two conceptual approaches to assessing vertebrate clonal ages have 
been attempted. The first involves estimating the genetic distance between 
a unisexual biotype and its closest sexual relative. In a review of 24 unisex- 
ual vertebrate lineages (Avise et al. 1992c), 13 (54%) proved indistinguish- 
Figure 5.4 Three modes of unisexual reproduction in vertebrates. In
parthenogenesis, the female's nuclear genome is transmitted intact to the egg,
which then develops into an offspring genetically identical to the mother. In
gynogenesis, the process is similar, except that heterospecific sperm (indicated
by an asterisk) from a related sexual species is required to stimulate egg
development. In hybridogenesis, an ancestral genome from the maternal line is
transmitted to the egg without recombination, whereas paternally derived
chromosomes are discarded pre-meiotically, only to be replaced each generation
via fertilization by heterospecific sperm (indicated by an asterisk) from a
related sexual species. (After Avise et al. 1992c.)


able in mtDNA assays from an extant genotype in the related sexual taxon, 
indicating extremely recent evolutionary origins for these unisexuals. Five 
additional unisexual lineages differed from their closest sexual relatives by 
less than 1% in mtDNA sequence, suggesting origination times within the
last 500,000 years under a standard mtDNA clock (Figure 5.5A). A few 
unisexual haplotypes showed greater sequence differences from related sex- 
ual forms, and these differences translated into literal estimates of evolu- 
tionary durations of perhaps a few million years. However, there is a seri- 
ous reservation about the relevance of such estimates: Closer relatives with- 
in the sexual progenitor lineage may have gone extinct after the separation 
of the unisexual biotype, or otherwise remained unsampled, such that uni- 
sexual ages could be grossly overestimated by this approach. Indeed, 
because of the low mtDNA lineage diversity observed within most unisex- 
ual taxa relative to their sexual cognates (Figure 5.5B), most authors have 
concluded that nearly all unisexuals arose very recently, even when geneti- 
cally close mtDNA lineages were not observed among the sexual relatives 
sampled (e.g., Vyas et al. 1990). 
Figure 5.5 Genetic patterns in unisexual vertebrates. (A) Frequency distribution
of the smallest genetic distances observed between mtDNA genotypes in 24
unisexual biotypes and their closest assayed sexual relatives. Also shown are
associated evolutionary ages of the unisexuals based on the conventional mtDNA
clock calibration of 2% sequence divergence per million years between lineages.
(B) Nucleotide diversities in mtDNA within 13 sexual species (above horizontal
axis) and their respective unisexual derivatives (below horizontal axis),
arranged in rank order from left to right by the magnitude of variation within
the various sexual taxa. (After Avise et al. 1992c.)
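
The conversion from mtDNA distance to a literal age estimate is simple arithmetic, sketched below under the conventional calibration of 2% sequence divergence per million years between lineages. The function name is illustrative, and the calibration is treated as a fixed constant, which is a simplification.

```python
def divergence_to_age_years(percent_divergence, percent_per_million_years=2.0):
    """Literal separation time implied by a pairwise mtDNA distance.

    The default rate is the conventional clock of 2% sequence divergence
    per million years between a pair of lineages.
    """
    return percent_divergence / percent_per_million_years * 1_000_000


print(divergence_to_age_years(1.0))  # 500000.0
print(divergence_to_age_years(0.0))  # 0.0
```

A distance of 1% thus corresponds to about 500,000 years. Because the closest sexual relative may be extinct or unsampled, figures obtained in this way are better read as upper bounds on the age of a unisexual lineage than as point estimates.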


One example in which an ancient clonal age was promulgated involved 
gynogenetic mole salamanders in the genus Ambystoma. From comparisons 
of mtDNA sequences in the unisexuals and in their extant sexual relatives, 
Hedges et al. (1992a) and Spolsky et al. (1992) estimated that the gynogens 
had had evolutionary durations of about 4 to 5 million years. However, this 
evidence for clonal persistence comes with a serious caveat: namely, that
these salamanders are unusual among vertebrate unisexuals in that their 
evolution may not be strictly clonal. Other molecular data suggest that they 
continually acquire nuclear DNA from sexual species, presumably via occa- 
sional incorporation of sperm DNA into the egg. If so, the antiquity of these 
"clonal" salamander biotypes applies strictly only to the mtDNA lineages 
that they contain. 

A second approach to estimating clonal ages involves assessing the 
scope of genetic variation within particular unisexual clades that, accord- 
ing to independent evidence, are monophyletic (i.e., originated from a sin- 
gle hybridization event, in this case). In principle, this method avoids con- 
founding post-formational mutational variation that might be indicative of 
ancient lineage age with genetic diversity that collectively arose from sep- 
arate hybridization events. Quattro et al. (1992a) examined mtDNA and 
allozyme variability within a hybridogenetic clade of fishes in northwest- 
ern Mexico (Poeciliopsis monacha-occidentalis) that, according to independent 
zoogeographic evidence and tissue graft analyses, was indeed mono- 
phyletic. The molecular data confirmed the clade's monophyly and also 
documented considerable genetic diversity within it, including the accu- 
mulation of several mitochondrial and allozyme mutations in the monacha 
portion of its genome (which comes from the female parent). From the 
magnitude of this post-formational diversity, the authors estimated that the 
unisexual clade was more than 100,000 generations old. However, concern- 
ing the relevance of this conclusion to broader arguments about clonal per- 
sistence, there is again at least one reservation: These fishes reproduce by 
hybridogenesis, which means that only the maternal component of their 
genetic heritage is strictly clonal. The nuclear genome receives fresh but 
transient sexual input each generation from a sire, P. occidentalis, such that 
the overall genetic system is “hemiclonal.” 

In any event, Maynard Smith (1992) argued that 100,000 generations in 
evolutionary terms “is but an evening gone,” and that the molecular find- 
ings for Poeciliopsis therefore do not contradict the conventional wisdom 
that organismal clones are short-lived. Regardless of one’s perspective on 
whether such time scales are “long” or “short” in the context of clonal per- 
sistence debates, molecular data have provided the first critical information 
regarding the evolutionary duration of vertebrate as well as invertebrate 
lineages (or portions thereof) that lack recombination. 


Clonal reproduction in microorganisms 


PROTOZOANS. Eukaryotic protozoans such as the agents of malaria, sleep- 
ing sickness, Chagas’ disease, and leishmaniasis infect more than 10% of the 
world’s human population and account for tens of millions of deaths every 
year. The classic assumption was that these parasites (most of which are 
diploid) routinely engage in sexual reproduction because recombination 
among strains had been observed in the laboratory and because a sexual 
phase in the life cycle was thought to occur in particular host species.
However, in the early 1990s, studies using molecular markers began to 
demonstrate that several parasitic protozoans can and often do propagate 
clonally in nature (Tibayrenc et al. 1990, 1991a). For example, analyses of 
scnRFLPs revealed that virulent strains of the protozoan Toxoplasma gondii 
from around the world are surprisingly homogeneous genetically, probably 
because of frequent clonal reproduction and evolutionary origins from a 
small number of nonvirulent strains that exhibit moderate polymorphism 
and are capable of sexual reproduction (Sibley and Boothroyd 1992). Recent 
molecular evidence has elaborated on this scenario by indicating the global 
predominance of three clonal lineages in T. gondii that all may have emerged 
within the last 10,000 years after a single genetic cross (Su et al. 2003). Such 
findings are of tremendous medical importance because of their relevance in 
diagnostic tests for disease agents, as well as in strategies for developing 
vaccines and curative drugs (Tibayrenc et al. 1991b). 


BOX 5.2 Population Genetic Criteria Suggestive of Clonal Reproduction


This table presents two classes of inferences, based on four types of genetic
observations (criteria a-d), that would tend to indicate clonal reproduction in
natural populations. "Comments and caveats" are given in regard to eliminating
the possibility of frequent sexual reproduction.


Inferences (from observations), each followed by its comments and caveats


I. Absence of meiotic segregation at particular marker loci

(a) Fixed heterozygosity (most or all individuals appear heterozygous, at
    least in some populations or subpopulations)
    Comments and caveats: Observation also incompatible with
    self-fertilization; must consider possibility of gel mis-scoring due,
    for example, to gene duplication or polyploidy

(b) Significant deficit in frequencies of some expected diploid genotypes;
    other deviations from Hardy-Weinberg equilibrium
    Comments and caveats: Missing heterozygotes also consistent with
    self-fertilization; must exclude effects of population subdivision,
    assortative mating, selection, etc.

II. Absence of recombination among multiple marker loci

(c) Overrepresented, widespread identical genotypes; significant deficit of
    expected recombinant genotypes; non-random associations of alleles
    (gametic phase disequilibrium; see Box 5.3)
    Comments and caveats: Should consider possible effects of selection or
    population subdivision; must take into account low expected frequencies
    of multi-locus genotypes when allelic variation is high

(d) Correlation between independent sets of genetic markers
    Comments and caveats: Should consider possible effects of population
    subdivision or correlated selection pressures


Source: After Tibayrenc et al. 1991b.


The case for clonal reproduction in nature is particularly strong for
Trypanosoma cruzi, the agent of Chagas’ disease. Extensive molecular studies
of T. cruzi have involved strains isolated from humans, insect vectors, and
mammals sampled throughout the range of the disease in Central and South
America (Barnabé et al. 2000; Oliveira et al. 1998; Tibayrenc and Ayala 1987,
1988; Tibayrenc et al. 1986; Zhang et al. 1988). These surveys have revealed
several population genetic signatures characteristic of prevalent clonal
reproduction (Box 5.2): fixed heterozygosity and other evidence for an absence
of segregation genotypes at individual loci; an overrepresentation of identical
multi-locus genotypes, often geographically widespread; and significant
correlations (disequilibrium) between independent sets of genetic markers (Box
5.3), even after accounting for the effects of population subdivision
(Tibayrenc 1995). Furthermore, molecular markers indicate that two widespread
groups of clonal genotypes in T. cruzi were the result of ancient hybridization
events (Brisse et al. 2000; Machado and Ayala 2001). These findings overturned the
BOX 5.3 Gametic Phase Disequilibrium


Gametic phase disequilibrium is the nonrandom association between alleles at
different loci. "Nonrandom" means that the multi-locus combinations of alleles
depart significantly from expectations based on products of the single-locus
allele frequencies. Consider the simplest possible case, involving two
autosomal nuclear loci (A and B), each with two alleles ($A_1$, $A_2$ and
$B_1$, $B_2$), whose frequencies are $p_1$, $p_2$, $q_1$, and $q_2$,
respectively. Four di-locus gametic genotypes (or haplotypes) are possible:

                                    Alleles at locus A
                               $A_1$ ($p_1$)           $A_2$ ($p_2$)
Alleles at     $B_1$ ($q_1$)   $A_1B_1$ ($P_{11}$)     $A_2B_1$ ($P_{21}$)
locus B        $B_2$ ($q_2$)   $A_1B_2$ ($P_{12}$)     $A_2B_2$ ($P_{22}$)

If alleles are associated at random in haplotypes (gametic phase equilibrium),
the expected frequencies of these di-locus genotypes are $p_1q_1$, $p_1q_2$,
$p_2q_1$, and $p_2q_2$. One quantitative measure of a departure from this
expectation is given by the gametic phase disequilibrium parameter, defined as

$$D = P_{11}P_{22} - P_{12}P_{21}$$

where $P_{11}$ and $P_{22}$ are the observed frequencies of haplotypes in the
"coupling" phase, and $P_{12}$ and $P_{21}$ are observed frequencies of
haplotypes in the “repulsion” phase. It can be shown that in a large randomly
mating population, any initial disequilibrium [$D(0)$] among neutral alleles
tends to decay toward zero (provided that $c \neq 0$) according to the equation

$$D(G) = (1 - c)^{G} D(0)$$

where $D(G)$ is the disequilibrium remaining at generation $G$, and $c$ is the
probability of a recombination event between the two loci each generation (or
the "recombination fraction"). Thus, for unlinked loci ($c = 0.5$),
disequilibrium decays by one-half each generation. Disequilibrium decays more
slowly as the recombination fraction decreases. The loci examined may also
involve one nuclear gene and one cytoplasmic gene, in which case the analogous
nonrandom gametic associations are referred to as “cytonuclear disequilibria”
(as described later in Box 7.4). Because nuclear and cytoplasmic genes are
unlinked, any initial disequilibria between such loci are likewise expected to
decay monotonically to zero by one-half per generation in a randomly mating
population.

Gametic phase disequilibrium can arise from any historical or contemporary
process that has restricted recombination among loci. These processes may
include physical linkage of genes on a chromosome or other factors that also
can generate nonrandom associations among the alleles of unlinked loci, such
as population subdivision, founder effects, mating systems with close
inbreeding (see Chapter 6), or selection favoring particular multi-locus
allelic combinations (Lewontin 1988).
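The two-locus quantities in this box can be illustrated with a short numerical sketch. It assumes the conventional labeling in which P11, P12, P21, and P22 are the observed frequencies of haplotypes A1B1, A1B2, A2B1, and A2B2, so that D = P11*P22 - P12*P21 and D(G) = (1 - c)^G * D(0). The function names and the haplotype frequencies are hypothetical, chosen only for illustration.

```python
def disequilibrium(p11, p12, p21, p22):
    """Gametic phase disequilibrium D = P11*P22 - P12*P21.

    pij is the observed frequency of haplotype A_i B_j; the four
    frequencies are assumed to sum to 1.
    """
    return p11 * p22 - p12 * p21


def expected_haplotypes(p11, p12, p21, p22):
    """Haplotype frequencies expected if alleles associate at random."""
    p1, p2 = p11 + p12, p21 + p22  # allele frequencies at locus A
    q1, q2 = p11 + p21, p12 + p22  # allele frequencies at locus B
    return {"A1B1": p1 * q1, "A1B2": p1 * q2, "A2B1": p2 * q1, "A2B2": p2 * q2}


def decay(d0, c, generations):
    """Disequilibrium remaining after G generations: (1 - c)**G * D(0)."""
    return (1.0 - c) ** generations * d0


# Hypothetical haplotype frequencies (illustrative values only).
freqs = dict(p11=0.4, p12=0.1, p21=0.1, p22=0.4)
d0 = disequilibrium(**freqs)
print("D(0) =", round(d0, 4))
for c in (0.5, 0.1, 0.0):
    # c = 0.5: unlinked loci; c = 0.0: no recombination, so D persists.
    print("c =", c, [round(decay(d0, c, g), 4) for g in range(4)])
```

With c = 0 the disequilibrium never decays, which is the situation expected for strictly clonal lineages; any nonzero c erodes it, and it is halved every generation when the loci are unlinked.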

Genomes or portions thereof characterized by a rarity or absence of 
recombination may be viewed as linked “supergenes” with regard to evolu- 
tionary dynamics. One likely consequence involves “genetic hitchhiking,” 
wherein alleles that are mechanistically neutral nonetheless may spread 
through a population because of chance association with an allele favored by 
natural selection at another locus. Such hitchhiking on rare favorable muta- 
tions could lead to “periodic selection” that has the net effect of purging pop- 
ulation genetic variability in nonrecombining systems (Levin 1981). The 
“selective sweeps” involved in periodic selection may account in part for the 
observation that genetic variation in mtDNA (see Chapter 2) and in some bac- 
terial species (Milkman 1973) is vastly lower than might otherwise have been 
expected (given suspected mutation rates to neutral alleles and apparent pop- 
ulation sizes in these systems). Such selective sweeps may also contribute to 
the observation for eukaryotic nuclear genomes that chromosomal regions 
characterized by low recombination rates sometimes have reduced nucleotide 
diversity (Aquadro and Begun 1993; Aquadro et al. 1994; Begun and Aquadro 
1992; but see also Hamblin and Aquadro 1999 for exceptions and interpretive 
complications). 


conventional wisdom that this species was an undifferentiated quasi-panmic- 
tic entity, and they carried major medical ramifications as well (Revollo et al. 
1998). On the other hand, further molecular evidence has emerged support- 
ing the contention that some natural populations of T. cruzi also experience 
regular occurrences of genetic exchange (Gaunt et al. 2003). 

The “clonal theory” for parasitic protozoans was rather heretical in the 
early 1990s, but it has become widely accepted as an important element 
(but seldom the whole story) of reproduction in several of these microbes 
(Table 5.4). Depending on the particular life cycle and other biological fac- 
tors, many parasitic protozoan species also engage occasionally, if not reg- 
ularly, in a variety of modes of sexual or recombinational genetic exchange, 
including horizontal gene transmission (see Chapter 8). This also means 
that when “clonal genotypes” occur in such quasi-sexual species, they can 
have varied evolutionary origins and diverse ages. One way to acknowl- 
edge such heterogeneity for partially clonal taxa, especially in medical 
applications, is to employ additional clarifying terms. One such moniker is 
“clonet,” defined by Tibayrenc and Ayala (1991) as a set of lineages that 
appears identical strictly according to a specified set of genetic markers. 
Another is “discrete typing units,” defined by Tibayrenc (1998) as recog- 
nizably stable genetic subdivisions of a species as assessed from total evo- 
lutionary evidence. 
TABLE 5.4 Population genetic evidence for significant clonal reproduction in
several parasitic protozoans and fungi that are important agents of human
disease


                                     Criterion^a           Evidence for
Organism                         (a)  (b)  (c)  (d)        clonality^b

Protozoans
  Entamoeba histolytica           0    0    +    0         Moderate
  Giardia spp.                    0    0    +    +         Moderate
  Leishmania guyanensis           0    0    +    +         Strong
  Leishmania infantum             0    0    +    +         Strong
  Leishmania tropica              +    0    +    +         Strong
  Leishmania major                0    0    +    0         Strong
  Leishmania spp.                 0    0    +    +         Strong
  Naegleria australiensis         +    0    +    0         Weak
  Naegleria fowleri               +    0    0    0         Weak
  Naegleria gruberi               +    0    0    0         Weak
  Plasmodium falciparum           0    0    +    0         Weak
  Toxoplasma gondii               0    0    +    0         Weak
  Trichomonas foetus              0    0    +    0         Weak
  Trichomonas vaginalis           0    0    +    0         Weak
  Trypanosoma brucei              +    +    +    +         Strong
  Trypanosoma congolense          +    0    +    0         Moderate
  Trypanosoma cruzi               +    +    +    +         Strong
  Trypanosoma vivax               0    0    +    0         Strong
Fungi
  Candida tropicalis complex      0    0    +    0         Weak
  Candida albicans                +    +    +    +         Strong
  Cryptococcus neoformans         0    0    +    0         Weak
  Saccharomyces cerevisiae        0    0    +    0         Weak


Source: Expanded from Tibayrenc et al. 1991b; Tibayrenc and Ayala 2002.

^a Criteria for clonality are described in Box 5.2. +, criterion is satisfied; 0, data not available.
Even where clonal reproduction has been firmly documented, this does not necessarily
exclude occasional or even frequent sexual reproduction, which often has been evidenced as
well.

^b Overall weight of available molecular evidence. Data in various taxa have come from
allozymes, RAPDs, microsatellites, gene sequences, or other molecular markers.


FUNGI. Molecular markers have also been used to address questions about 
modes of asexual and sexual reproduction in several microbial fungi, 
including pathogenic forms (see Table 5.4). In one early study, Newton et al. 
(1985) employed allozymes and cytoplasmically transmitted RNAs (from 
mycoviruses inside the fungal cells) to assay numerous accessions of the 
cereal rust Puccinia striiformis from around the world. Representatives of one 
widespread group of wheat-attacking forms (P. s. tritici) proved completely 
uniform, whereas some other related species showed much higher levels of 
genetic variation. The cereal rust has no known sexual stage, so the genetic 
results are probably attributable to prevalent clonal transmission. 











Candida albicans is a commensal diploid yeast normally inhabiting human 
mucosal epithelia, but in immunocompromised patients it can become a 
lethal pathogen. Traditionally this species was thought to be asexual, but its 
genome was recently found to possess a fungal mating-type-like (MTL) gene 
(Hull and Johnson 1999), and cell fusions have been observed (Hull et al. 2000; 
Magee and Magee 2000), raising the possibility that sexual reproduction 
might be a normal feature of its poorly known life cycle. To address these 
issues, various molecular markers, including allozymes, RAPDs, RFLPs, 
microsatellites and others, have been employed. Results indicate that the pop- 
ulation genetics of this species is characterized by a mixture of occasional 
recombination and extensive clonality (Fundyga et al. 2002; Graser et al. 1996; 
Pujol et al. 1993; Xu et al. 1999). Apart from contributing to knowledge of basic 
biology in this species, these molecular markers have also found clinical 
application in distinguishing various Candida strains and species (Boerlin et 
al. 1996; Mannarelli and Kurtzman 1998; Pinjon et al. 1998). 


BACTERIA. Following the discoveries in the mid-twentieth century of conjugation,
transformation, and transduction in laboratory strains of Escherichia 
coli, a notion developed that recombinational genetic exchange might be 
prevalent in bacterial taxa in the wild (Hedges 1972). However, traditional 
research into bacterial systematics, epidemiology, and pathogenicity 
involved analyses of gross phenotypes (physiological, serological, etc.) that 
seldom could be tied to specific alleles or loci. Thus, reliable assessments of 
population genetic structure and reproductive mode remained elusive 
(Selander et al. 1987a). This situation started to change with protein elec- 
trophoretic surveys beginning in the early 1970s (Milkman 1973, 1975). Initial 
results from population analyses of allozyme markers suggested that, in 
addition to various mechanisms for genetic exchange, clonal reproduction is 
an important element underlying the genetic structure in natural popula- 
tions of many bacterial taxa. Most bacteria are haploid, so the evidence for 
clonality usually consists primarily of criteria (c) and (d) in Box 5.2. 

For example, early work on E. coli revealed that despite high allozyme 
variation (H = 0.50—an order of magnitude greater than values typifying 
higher eukaryotes), the number of distinctive protein electrophoretic pro- 
files was unexpectedly constrained. Furthermore, presumptive clonal line- 
ages (those identical or closely similar in multi-locus allozyme genotype) 
were sometimes observed even in samples from geographically wide- 
spread, unassociated hosts (Ochman and Selander 1984; Selander and Levin 
1980; Whittam et al. 1983a,b). Such findings of strong disequilibrium across 
loci (see Box 5.3) soon led to a view that chromosomal transmission in E. coli 
is basically clonal, albeit with occasional exchanges of sequence causing par- 
tial reticulation among lineages (Milkman and Bridges 1990; Milkman and 
Stoltzfus 1988). These conclusions applied strictly to chromosomal DNA; 
extrachromosomal sequences (notably plasmids, which are often of adap- 
tive relevance to the bacterium) did appear to be commonly exchanged 
among E. coli strains (Hartl and Dykhuizen 1984; Valdés and Piñero 1992). 
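The contrast described above, between high single-locus variation and an unexpectedly small number of multi-locus profiles, can be made concrete with a minimal sketch. It assumes haploid isolates scored at several loci, computes gene diversity at each locus as H = 1 - sum(p_i^2), and compares the observed count of each multi-locus genotype with the count expected if alleles were associated at random across loci. The function names and the data are hypothetical and are not drawn from any of the cited surveys.

```python
from collections import Counter


def gene_diversity(alleles):
    """H = 1 - sum(p_i**2) at one locus, from a list of allele labels."""
    n = len(alleles)
    return 1.0 - sum((k / n) ** 2 for k in Counter(alleles).values())


def genotype_excess(isolates):
    """Observed versus expected counts of multi-locus genotypes.

    isolates: list of equal-length tuples, one allele label per locus
    (haploid data). The expected count is the sample size times the
    product of the single-locus allele frequencies, which is the
    random-association expectation.
    Returns {genotype: (observed_count, expected_count)}.
    """
    n = len(isolates)
    n_loci = len(isolates[0])
    # Allele counts at each locus.
    counts = [Counter(iso[j] for iso in isolates) for j in range(n_loci)]
    result = {}
    for genotype, observed in Counter(isolates).items():
        expected = float(n)
        for j, allele in enumerate(genotype):
            expected *= counts[j][allele] / n
        result[genotype] = (observed, expected)
    return result


# Hypothetical electrophoretic types at three loci (illustrative only).
isolates = [("a", "x", "m")] * 6 + [("b", "y", "n")] * 5 + [("a", "y", "n")]
mean_h = sum(gene_diversity([iso[j] for iso in isolates]) for j in range(3)) / 3
print("mean H =", round(mean_h, 3))
for genotype, (obs, exp) in genotype_excess(isolates).items():
    print(genotype, "observed", obs, "expected", round(exp, 2))
```

In this toy sample each locus is highly variable, yet two genotypes account for nearly all isolates and occur several times more often than random association would predict. As Box 5.2 cautions, a real analysis would also have to consider selection and population subdivision before attributing such an excess to clonality.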

This new paradigm favoring frequent clonal inheritance in E. coli carried
several biological ramifications. Within an individual host, the presence 
of multiple clones might reflect successful invasions of independent 
founder lineages (Caugant et al. 1981), rather than novel genotypes arising 
solely via recombination among a few pioneering strains. Conversely, some 
specifiable clonal lineages could be distributed widely, perhaps even glob- 
ally. It also became somewhat easier to deduce the phylogeny of various 
strains of E. coli and its suspected relatives, such as Shigella (Whittam et al. 
1983b). Furthermore, to the extent that recombination in E. coli was less fre- 
quent than formerly imagined, its genome would constitute a low-recombi- 
nation genetic system subject to some special evolutionary dynamic forces 
(see Box 5.3). Notable among these forces is “periodic selection” (sequential 
replacement of clonal lineages by genotypes displaying higher fitness), 
which can have the net effect of truncating overall population genetic vari- 
ability both in neutral markers and in the selected genes to which they are 
linked. 

Similar evidence for clonal chromosomal inheritance in other bacterial 
species, including pathogenic forms, soon led in the 1980s to a molecular 
genetic revolution in bacterial taxonomy and epidemiology (Selander and 
Musser 1990). The following discoveries are just a few examples (Selander et 
al. 1987b). Based on the distinctiveness of sets of clones earmarked by more 
than 50 different multi-locus allozyme genotypes, Legionella was shown to 
consist of two distinct species that had masqueraded as L. pneumophila 
(Selander et al. 1985). Some of the electrophoretic types (ETs) occurred world- 
wide, and one particular lineage caused Legionnaires' disease and Pontiac 
fever. In Bordetella, a pathogen responsible for a variety of respiratory
diseases in animals and whooping cough in humans, numerous clones and
several genetically distinct species were shown to exhibit strong host
specificities (Musser et al. 1987): clone ET-1 of B. bronchiseptica is a pig
specialist, ET-6 is a dog specialist, and the named taxa B. parapertussis and
B. pertussis are other clonal forms of B. bronchiseptica that have become
specialized as human 
pathogens. In Haemophilus influenzae, certain clones were found to be distrib- 
uted worldwide, and one distinctive clonal group (ET-91-94) was pinpointed 
as the cause of meningitis and septicemia in human neonates. Another clone 
(ET-1) was found to have increased greatly in frequency in the United States 
between 1939 and 1954, and by 1990 caused about 30% to 40% of all disease 
cases, whereas other Haemophilus clonal groups exhibited no clear association 
with particular disease conditions (Musser et al. 1985, 1986). In Neisseria 
meningitidis, high clonal variation was found using molecular markers, but 
only a few among the hundreds of multi-locus genotypes may have been 
responsible for most major pandemics in the twentieth century. For example, 
an epidemic that started in Norway in the mid-1970s and spread through 
Europe was caused by one group of clones in the ET-5 complex (Caugant et 
al. 1986). So too was another severe epidemic that appeared in the late 1970s 
in Cuba, from which it was spread by Cuban refugees to Miami, where 
another outbreak was initiated in 1980-1981. 

These and other pioneering studies of clonal structure in bacteria exemplified
the power of molecular markers in addressing evolutionary problems of
diagnostic and epidemiologic relevance. On the other hand, not all 
bacterial taxa proved to be predominantly clonal. Neisseria gonorrhoeae 
appeared to have rampant genetic exchange, as judged by random associa- 
tions of alleles among loci (Maynard Smith et al. 1993). One collection of 
wild Bacillus subtilis displayed a great diversity of allozyme and RFLP geno- 
types, leading Istock et al. (1992) to conclude that recombination must be 
frequent in this species also, perhaps due to its proclivity for spontaneous 
transformation, whereby cells take up free DNA from their surroundings. Such 
examples notwithstanding, population genetic data generally indicated that 
recombination events and gene flow within many bacterial species were far 
too rare to produce random allelic associations throughout a taxonomic 
species. However, even rare recombination can be of huge evolutionary sig- 
nificance if it occasionally generates high-fitness genotypes that then for a 
time are mostly clonally propagated. Indeed, for many bacterial taxa, direct 
evidence for chromosomal recombination accrued from allozymes and 
other molecular markers (DuBose et al. 1988; Selander et al. 1991a). Thus, 
any particular bacterial strain probably has its chromosomes derived in bits 
and pieces from multiple ancestors, the degree of historical heterogeneity 
depending on such factors as ancestral remoteness, the magnitude of geo- 
graphic population structure due to restricted gene flow, the closeness of 
physical linkage of the genes involved, and, of course, the primary repro- 
ductive mode, which also significantly affects the number of past recombi- 
national events (Hartl and Dykhuizen 1984). 
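Whether alleles are associated at random among loci is often summarized in this literature by an index of association. The text above does not give a formula, so the sketch below is an assumption about how such a test might be set up. For haploid isolates it takes K as the number of loci at which two isolates differ, the observed variance of K over all pairs, and an expected variance equal to the sum of h_j(1 - h_j), where h_j is the fraction of pairs differing at locus j. The index is the ratio of observed to expected variance minus one. The function name and the data are hypothetical.

```python
from itertools import combinations


def index_of_association(isolates):
    """Index of association for haploid multi-locus genotypes.

    isolates: list of equal-length tuples of allele labels (at least two).
    Values near 0 suggest random association among loci; clearly
    positive values suggest linkage disequilibrium.
    """
    n_loci = len(isolates[0])
    pairs = list(combinations(isolates, 2))
    # K for each pair: number of loci at which the two isolates differ.
    distances = [sum(a != b for a, b in zip(x, y)) for x, y in pairs]
    mean_k = sum(distances) / len(pairs)
    v_obs = sum((k - mean_k) ** 2 for k in distances) / len(pairs)
    # h_j: fraction of pairs that differ at locus j.
    h = [sum(x[j] != y[j] for x, y in pairs) / len(pairs) for j in range(n_loci)]
    v_exp = sum(hj * (1.0 - hj) for hj in h)
    if v_exp == 0:
        raise ValueError("no allelic variation; index is undefined")
    return v_obs / v_exp - 1.0


# Hypothetical samples: two clones versus all allele combinations equally common.
clonal = [("a", "x")] * 50 + [("b", "y")] * 50
mixed = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")] * 25
print(round(index_of_association(clonal), 3))
print(round(index_of_association(mixed), 3))
```

In practice the observed value would be compared with a null distribution, for example by shuffling alleles among isolates within each locus, and population subdivision would need to be ruled out before a positive index is read as evidence of clonality.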

A second revolution in the study of pathogenic bacteria, beginning most- 
ly in the 1990s, accompanied the enormous influx of information from DNA 
sequencing. Genomes of hundreds of bacterial taxa have now been fully 
sequenced. A powerful new approach in evolutionary medicine is to analyze 
multiple strains within a disease-causing species to establish the historical 
origins of virulence and the molecular basis of pathogenesis (Fitzgerald et al. 
2001; see reviews in Fitzgerald and Musser 2001; Whittam and Bumbaugh 
2002). For example, by phylogenetically analyzing DNA sequences from 
multiple pathogenic strains of E. coli, Reid et al. (2000) demonstrated the fol- 
lowing: recombination had not completely obscured the chromosomes' 
ancestral conditions; an evolutionary diversification of virulent clones prob- 
ably began about 9 million years ago; a virulent strain (O157:H7) responsible 
for epidemics of food poisoning originally separated from a common ances- 
tor of E. coli type K-12 as long as 4.5 million years ago; and some old lineag- 
es of E. coli apparently acquired the same virulence factors in parallel. 

The virulence factors that can convert harmless E. coli to pathogenic 
forms include gene sequences encoded on mobile genetic elements such as 
plasmids and bacteriophages, as well as on distinct "pathogenicity islands" 
integrated into the bacterial chromosome (McGraw et al. 1999). Some of these 
pathogenicity islands have been well characterized and may involve, for 
example, genes encoding outer membrane proteins that mediate intimate 
attachment of bacteria to eukaryotic cells. Various islands may themselves be 
legacies of past integrations of nucleic acids following horizontal gene trans- 
fer events (see Chapter 8) between otherwise distinct bacterial strains. The 
net result is that a given bacterial clonal lineage is really a historical mosaic 
of genes with different evolutionary sources. 


Genetic chimeras 


A genetic chimera is an "individual" composed of a mixture of genetically 
different cells—that is, cells typically stemming from separate zygotes. The 
phenomenon is quite rare in the biological world, having been documented 
only in miscellaneous protists, plants, and animals representing about ten 
phyla (Buss 1982). Far more normally in nature, each multicellular individ- 
ual is composed of genetically identical cells all tracing back asexually 
through mitotic cell divisions to one fertilized egg (Grosberg and 
Strathmann 1998; Michod 1999). This makes evolutionary sense, because 
multicellularity is the ultimate expression of inter-cell collaboration attrib- 
utable to kin selection stemming from high genetic relatedness (see Chapter 
6). Normally, close kinship is prerequisite for the extraordinary levels of 
cooperation and self-sacrifice displayed by somatic cells, which toil on 
behalf of their potentially immortal germ line kin without prospect of self- 
perpetuation per se (Maynard Smith and Szathmáry 1995; Queller 2000). 
From similar logic, true genetic chimeras are of special evolutionary inter- 
est: Why would unrelated cells ever join collaborative forces to constitute a 
multicellular individual? 

Proper identification of genetic chimeras is important for several other 
reasons. First, cross-pollinations in plants (or cross-fertilizations in animals) 
might occur between genotypes within a chimeric individual, thus influ- 
encing the genotypes of progeny and the perceived mating system of a 
species. Second, if chimeras are common in particular species, the number 
of genets in a population could be seriously underestimated by a mere cen- 
sus of ramet numbers, with consequences extending to any parameters that 
are influenced by effective population size (such as expected magnitudes of 
genetic drift). The occurrence of genetic chimeras also raises important 
issues regarding the degree of physiological and functional integration of 
composite individuals. 


MICROBES AND PLANTS. One of the best-studied chimeras—the slime mold 
or social amoeba, Dictyostelium discoideum—occurs at one stage in the crea- 
ture’s life cycle. Individual free-living cells inhabit the forest floor, where 
they consume bacteria, but when the food supply becomes scarce, these 
haploid cells gather together to form a sluglike amoeboid structure that 
migrates to a more favorable spot and then forms a fruiting body that releas- 
es spores. Studies based on microsatellite markers have shown unequivo- 
cally that the amoeboid form often consists of genetically distinct cells and, 
hence, is a genuine chimera (Fortunato et al. 2003; Strassmann et al. 2000b). 
Additional analyses of Dictyostelium have shown one potential advantage of 
unlike cells joining forces: Amoeboid aggregations with many cells (as 
opposed to fewer, as would be the case for any amoeba that excluded genet- 
ically different cells) can travel farther, thereby enhancing the aggregation’s 
overall prospects for survival and reproduction in times of food scarcity 
(Foster et al. 2002; Queller et al. 2003). 

Chimeras occur rarely in diploid metazoans as well. In plants, what 
appears to be a single ramet occasionally proves to be two or more genets 
that have fused into one morphologically (and perhaps physiologically) 
integrated module. A case in point involves strangler figs (Ficus spp.). These 
trees typically begin growth when bird-deposited or mammal-deposited 
seeds germinate in humus-filled crotches of a host tree. Shoots grow 
upward and roots downward around the host. The roots eventually cross 
and fuse to form a unified woody sheath around the host, which then may 
die, so that only the fig tree remains. Thompson et al. (1991) showed by 
allozyme analyses that “individual” fig trees often were genetic chimeras 
consisting of multiple genotypes. Thirteen of 14 sampled trees showed 
detectable genetic differences among branches, such that at least 45 genetic 
individuals were represented altogether. Presumably, the chimerism was 
attributable to post-germination fusions among roots that traced to multiple 
seeds deposited in the host tree. 
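The bookkeeping behind such a survey can be sketched briefly. The example assumes that several branches are sampled from each apparent individual (ramet) and scored for a multi-locus genotype, that any ramet yielding more than one distinct genotype is scored as a chimera, and that no genet is shared between ramets. The function name and the data are hypothetical.

```python
def count_genets(samples):
    """Tally distinct genotypes within each apparent individual.

    samples: {ramet_id: [multi-locus genotype tuple for each branch]}.
    Returns (genotypes per ramet, list of chimeric ramets, minimum
    number of genets). The total is a minimum because distinct genets
    can share a genotype at the loci assayed, and it assumes that no
    genet occurs in more than one ramet.
    """
    per_ramet = {ramet: len(set(genotypes)) for ramet, genotypes in samples.items()}
    chimeric = [ramet for ramet, k in per_ramet.items() if k > 1]
    return per_ramet, chimeric, sum(per_ramet.values())


# Hypothetical branch genotypes at two loci for three trees.
samples = {
    "tree1": [("FF", "SS"), ("FF", "SS"), ("FF", "SS")],
    "tree2": [("FF", "SS"), ("FS", "SS"), ("SS", "FS")],
    "tree3": [("FS", "FS"), ("FF", "FS")],
}
per_ramet, chimeric, min_genets = count_genets(samples)
print(per_ramet)
print("chimeric:", chimeric)
print("ramets:", len(samples), "minimum genets:", min_genets)
```

A census of ramets alone would count three individuals here, whereas the genotype tally indicates at least six genets, which is the kind of discrepancy that matters for estimates tied to effective population size.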


INVERTEBRATE AND VERTEBRATE ANIMALS. Fusions among ramets are com- 
mon in many marine invertebrates, including sponges, cnidarians, bry- 
ozoans, and colonial ascidians (Grosberg 1988; Jackson 1985). These somat- 
ic fusions may involve recently settled larvae (Hidaka et al. 1997) or mature 
colonies (Neigel and Avise 1983a), with the participants normally being 
asexual products of a single genet. However, in some situations, the fusing 
entities are known or suspected to be sexually produced siblings or more 
distant kin, thus generating genetic chimeras (Barki et al. 2002; Maldonado 
1998). For example, although fusions among clonemates are normal in the 
colonial hydroid Hydractinia symbiolongicarpus, they also occur occasionally 
between sexually produced full-sib and even half-sib pairs (Grosberg et al. 
1996; Hart and Grosberg 1999). 

The ascidian Botryllus schlosseri likewise can form chimeric colonies 
typically involving close relatives (Stoner and Weissman 1996). Micro- 
satellite assays have been used to identify genetically different cells and 
monitor their fates in laboratory settings (Pancer et al. 1995; Stoner et al. 
1999). Another ascidian (Diplosoma listerianum) has been documented by 
molecular markers to have rampant chimerism, sometimes even involving 
several distinct genotypes amalgamated from unrelated individuals 
(Bishop and Sommerfeldt 1999; Sommerfeldt and Bishop 1999). The 
authors speculate that chimerism is favored in this species because the 
phenomenon produces large colonies that survive and reproduce better 
than small ones (an explanation that closely parallels the documented 
advantages discussed above for chimerism in Dictyostelium slime molds). 

In some marine invertebrate species, aggregation and fusion among kin 
may often be facilitated by ecological factors such as co-settlement of non- 
dispersive larvae. Even then, however, chimerism remains the exception 
rather than the rule because cell rejection responses mediated by polymor- 
phic allorecognition genes typically are highly effective (Grosberg and Hart 
2000; Grosberg and Quinn 1986). 

Chimerism is known even in vertebrate animals, most notably in mar- 
mosets and tamarins (Benirschke et al. 1962; Haig 1999). These primates nor- 
mally give birth to two fraternal (non-identical, or dizygotic) twins per preg- 
nancy. In the first month of pregnancy, however, the tiny embryos partially 
fuse for a time inside the uterus, exchanging blood and some other body 
cells. Although the fetuses physically separate again before birth, molecular 
fingerprinting assays (of the common marmoset, Callithrix jacchus) have 
confirmed that each individual continues to be a chimera of its own blood 
cells plus those from its genetically distinct sibling (Signer et al. 2000). 

In a quite different sense, the cells of all eukaryotic organisms can be 
viewed as genetic chimeras consisting of amalgamations of nuclear and 
organelle genomes that had independent evolutionary histories prior to their 
ancient endosymbiotic mergers (Margulis 1970). This topic will be deferred 
to Chapter 8. 


Gender Ascertainment 


Ascertainment of an individual's sex can prove difficult in many situations, 
such as early life history stages, species with little dimorphism in secondary 
sexual characters, or species with internal gonads (such as birds). Yet knowl- 
edge of sexual identity is crucial in many ethological studies, in estimation 
of population sex ratios, in management of matings among endangered cap- 
tive animals (e.g., Millar et al. 1997), and in several other areas of population 
biology. In some taxonomic groups, such as reptiles, sex is often influenced 
by the temperature at which eggs are incubated (Bull 1980; Shine 1999), but 
gender in most other taxa is genetically "hard-wired." For these latter 
species, molecular assays of gender-associated ("sex-linked") DNA markers 
provide a powerful approach to sex identification at any stage of life 
(Griffiths 2000). 

A flurry of such pioneering studies, especially on avian species, began 
to appear by the early 1990s. In birds, females are the heterogametic gender, 
possessing ZW sex chromosomes in contrast to the male ZZ condition. 
Thus, W-specific molecular markers have been a prime target for sex-typing. 
For example, Quinn et al. (1990) isolated a segment of DNA homologous to 
the W (female-specific) chromosome of the snow goose (Chen caerulescens) 
and employed this molecular probe to determine the sex of more than 150 
birds, using blood samples. Similarly, Griffiths and Holland (1990) isolated 
a W-linked repetitive DNA marker for the herring gull (Larus argentatus), as 
did Rabenold et al. (1991) for stripe-backed wrens (Campylorhynchus 
nuchalis). In DNA fingerprinting assays, Millar et al. (1992) uncovered
W-specific bands in the brown skua (Catharacta lonnbergi) and used them to 
document significantly different sex ratios in adults versus chicks. 

Most of these early genetic screens involved Southern blotting tech- 
niques (so that fairly large tissue samples were needed) and were based on 
rapidly evolving repetitive sequences (so that the W-linked markers were of 
limited taxonomic range). A possibility also existed that occasional cross- 
homology between the Z and W chromosomes (as is known for portions of 
the mammalian X and Y; Burgoyne 1986; Ohno 1967; Page et al. 1982, 1984) 
might compromise some of these assays. Thus, later approaches moved 
toward PCR-based assays (as in RAPDs; Lessells and Mateman 1998) and 
toward assays of slower-evolving and better-characterized W-linked genes 
that might have broader taxonomic scope. The W chromosome of birds (like 
the Y chromosome of mammals) is small and carries few functional loci, but 
Ellegren (1996) did identify one useful W-linked gene (CHD-W) that pro- 
vided a near-universal tag for avian sexing (see also Griffiths et al. 1996, 
1998; Huynen et al. 2002). Within the last decade, DNA sex-typing based on 
this and other molecular marker systems has become common practice in 
avian behavioral ecology (Komdeur et al. 1997; Millar et al. 1996; Westerdahl 
et al. 1997), conservation efforts (Double and Olsen 1997; Robertson et al. 
2000), and related endeavors (Ellegren and Sheldon 1997). 
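
The scoring logic behind a PCR-based avian sex test can be sketched in a few lines of Python. The sketch assumes a CHD-type assay in which a Z-derived fragment amplifies in both sexes and a W-derived fragment amplifies only in females. The function name, the band labels "Z" and "W", and the treatment of failed reactions are illustrative assumptions, not a published protocol.

```python
def call_avian_sex(bands):
    """Infer sex from the set of fragments scored for one bird.

    `bands` is an iterable of hypothetical labels: "Z" for the Z-derived
    fragment and "W" for the W-derived (female-specific) fragment.
    """
    bands = set(bands)
    if not bands <= {"Z", "W"}:
        raise ValueError("unexpected band label")
    if "Z" not in bands:
        # Every bird carries at least one Z chromosome, so a missing Z
        # fragment points to a failed amplification rather than a genotype.
        return "no call"
    return "female (ZW)" if "W" in bands else "male (ZZ)"


print(call_avian_sex({"Z", "W"}))
print(call_avian_sex({"Z"}))
```

Treating the Z fragment as an internal positive control is the design choice that matters here: without it, a failed reaction in a female could be misread as a male.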

In mammals, in which males are the heterogametic sex, analogous sex- 
typing methods have been developed based on Y chromosome markers 
(Fernando and Melnick 2001). For example, Sinclair et al. (1990) discovered 
a Y-specific probe that could be used in Southern blot analyses to identify 
each individual's sex in a wide range of mammal species. A remarkable 
early application of this approach involved humpback whales (Megaptera 
novaeangliae), which, like other baleen whales, lack obvious secondary sexu- 
al characteristics. A human Y chromosome sequence was employed as a 
hybridization probe in RFLP analyses to determine the gender of 72 free- 
ranging humpbacks from which skin biopsies had been collected by special 
dart (C. S. Baker et al. 1991). The sex of another individual was DNA-typed 
from sloughed skin collected from the whale's swimming path. More recent- 
ly, cetaceans have also been sexed using PCR-based assays with primers for 
sex-specific ZFY and ZFX gene sequences (Bérubé and Palsbøll 1996). Wild 
brown bears (Ursus arctos) have similarly been sexed by Y-linked molecular 
markers, using DNA extracted from shed hairs (Taberlet et al. 1993, 1997). 

One important application for sex-linked molecular markers is in esti- 
mating sex ratios at early developmental stages for comparison against 
adult sex ratios. An example involves the Japanese frog Rana rugosa, in 
which females are the heterogametic sex. Newly fertilized eggs of this 
species, sampled in the field throughout the summer months, were assayed 
for gender using a PCR-amplified sex-specific marker (Sakisaka et al. 2000). 
The researchers discovered a male-biased primary sex ratio (i.e., the sex 
ratio at or near conception) early in the reproductive season but a 
female-biased sex ratio later on, a result that they interpreted as indicating that 
adults might somehow be able to influence sex ratios among their progeny. 

Another application for sex-linked molecular markers is in establishing 
the mode of sex determination in taxa for which it is otherwise unknown. 
An example involves gastropod mollusks in the genera Busycon and 
Busycotypus, which formerly were suspected to be protandrous (male-first) 
hermaphrodites. However, the fortuitous discovery of a sex-associated 
microsatellite marker with a transmission pattern analogous to that of X- 
linked genes in mammals strongly indicates that these whelks normally 
have separate sexes (Avise et al. 2004). This sex-linked microsatellite mark- 
er was then used to estimate near-primary sex ratios in brooded cohorts of 
tiny whelk embryos, for which it was otherwise impossible to assign gender. 

Remarkably, a few plants possess sex chromosome systems somewhat 
analogous to those of vertebrates and various other animals. For example, 
in the perennial dioecious weed Rumex acetosa, females are XX and males are 
XYY, so it seemed puzzling that sex ratios of flowering adults were 
female-biased. Surveys of DNA markers from both Y chromosomes resolved 
the quandary by permitting the ascertainment of sex and the estimation of 
sex ratios in seeds (Korpelainen 2002). It turned out that the primary sex 
ratio was about 1:1 in the total seed pool, and that the female-biased adult 
sex ratio resulted in part from higher male mortality during development. 
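
Whether a sex ratio estimated from molecular sexing departs from 1:1 is a simple statistical question. The following Python sketch applies an exact two-sided binomial test to counts of DNA-sexed individuals. The counts are hypothetical and are not taken from any study cited above, and the choice of test is an assumption made for illustration.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial P-value.

    Sums the probabilities of all outcomes that are no more probable
    than the observed count k out of n, under success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # The small tolerance guards against floating-point ties.
    tail = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical counts of DNA-sexed embryos.
males, females = 70, 30
print(binomial_two_sided_p(males, males + females))
```

The same function can be applied separately to early-stage and adult samples, which is the comparison that the primary versus adult sex ratio studies described above require.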


Genetic Parentage 


Molecular procedures for genetic assessment of parentage are similar in 
principle to those used to assess genetic identity versus non-identity, but 
with the added complication that rules of Mendelian transmission genetics 
must be taken into account when comparing the genotypes of sexually pro- 
duced progeny against those of putative parents. Parentage analyses often 
address some version of the following question: Are the adults who are 
associated behaviorally or spatially with particular young the true biologi- 
cal parents of the offspring in question? If the answer proves to be no, a 
genetic exclusion has been achieved (Box 5.4). Whether the actual mothers 
and fathers also can be specified depends on the size and genetic composi- 
tion of the pool of candidate parents and on the level of genetic variability 
monitored. Sometimes one biological parent is known from independent 
evidence and the problem simplifies to one of paternity (or maternity) 
exclusion or inclusion. In other cases, neither parent is known with certain- 
ty prior to the molecular study. 

Parentage analyses utilize cumulative information from multiple poly- 
morphic loci, scored either collectively, as complex DNA banding patterns 
on a gel (often in minisatellite DNA fingerprinting), or one at a time (e.g., by 
assays of allozymes or nuclear RFLPs), with the data tallied as discrete 
Mendelian genotypes accumulated across loci. In recent years, microsatellite 
assays have largely supplanted earlier methods of parentage assessment, 
especially in analyses of vertebrate animals, in which (unlike in many 
plants, for example) allozyme variation often tends to be insufficient to pro- 
vide high exclusionary power. Even a few hypervariable microsatellite loci 
in a population often display sufficient genetic variation to yield combined 
exclusion probabilities well above 0.99, thereby offering exquisite informa- 
tion on biological paternity and maternity. 

Various types of interpretive logic regarding parentage can be introduced 
by the following empirical examples, each a classic from the early literature: 


1. Maternity and paternity both uncertain, exclusions attempted. T. W. Quinn et 
al. (1987, 1989) compared goslings within each of several broods of 
snow geese (Chen caerulescens) against their adult male and female nest 
attendants (putative parents) using genetic markers from multiple 


BOX 5.4 Genetic Exclusions and Parentage Analyses 


Using Mendelian molecular markers, estimates of genetic maternity or genetic 
paternity can be achieved by excluding as parents all adults whose genotypes 
are incompatible with those of the juveniles under consideration. Associated 
with such “genetic exclusions” are statistical probabilities that are a joint function 
of the variability of the genetic markers employed and the biological nature of 
the particular parentage problem (e.g., perhaps one of the two parents is known 
from independent evidence, such as pregnancy). 

Exclusion probabilities may be either specific or average. Consider a neutral 
autosomal locus with two equally frequent alleles (A and B). In a large population 
at Hardy-Weinberg equilibrium, about 25% of all individuals would be 
homozygous AA, and another 25% homozygous BB. Suppose that molecular 
markers show that an AA mother has an AA offspring. All adult males in the 
population with genotype BB can be excluded as the youngster’s biological 
father (barring mutation), so the specific exclusion probability in this case is 0.25. 
An average exclusion probability, by contrast, is the mean probability (or the 
expected proportion) of excluded parents for randomly chosen juveniles. A mean 
exclusion probability may be higher than some specific exclusion values and 
lower than others because it is calculated by combining all specific exclusion 
probabilities weighted by the expected frequency of each parent-offspring pair 
in the population. Biologists are often particularly interested in average exclusion 
probabilities because they indicate the strength of available genetic markers for 
parentage exclusions (values above 0.99 typically are sought) and because these 
are useful for comparing statistical power across published studies. 
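
The distinction between specific and average exclusion probabilities can be made concrete by brute-force enumeration. The Python sketch below treats the mother-known case at a single codominant locus, assuming Hardy-Weinberg proportions among candidate males, random mating, and no mutation or null alleles. The function names and the dictionary representation of allele frequencies are illustrative choices.

```python
from itertools import product


def specific_exclusion(mother, offspring, freqs):
    """Fraction of random males excluded for one mother-offspring pair.

    Genotypes are 2-tuples of allele labels; `freqs` maps each allele
    to its population frequency.
    """
    x, y = offspring
    # Alleles the father could have contributed, given the mother.
    paternal = set()
    if x in mother:
        paternal.add(y)
    if y in mother:
        paternal.add(x)
    if not paternal:
        raise ValueError("offspring is incompatible with the stated mother")
    # A male is excluded if neither of his alleles is a possible paternal allele.
    return (1.0 - sum(freqs[a] for a in paternal)) ** 2


def average_exclusion(freqs):
    """Mean exclusion probability with the mother known, by enumeration."""
    total = 0.0
    for m1, m2, pat in product(freqs, repeat=3):
        # Ordered mother (m1, m2) transmits m1; a random male gamete supplies pat.
        weight = freqs[m1] * freqs[m2] * freqs[pat]
        total += weight * specific_exclusion((m1, m2), (m1, pat), freqs)
    return total


freqs = {"A": 0.5, "B": 0.5}
print(specific_exclusion(("A", "A"), ("A", "A"), freqs))
print(average_exclusion(freqs))
```

For two equally frequent alleles, the first call reproduces the value of 0.25 for the AA mother with an AA offspring, and the second gives an average of 0.1875, lower than that specific value because many mother-offspring pairs (for example, a heterozygous mother with a heterozygous offspring) exclude no males at all.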

The first formulae for calculating average exclusion probabilities were 
published early in the twentieth century (e.g., Weiner et al. 1930), but later 
methods generalized and extended the underlying models, of which two are 
most common: the “one-parent-known” case, in which independent evidence 
provides secure knowledge of either the mother or the father (Dodds et al. 
1996; Jamieson 1965; Jamieson and Taylor 1997; Weir 1996), and the "unknown 
parentage" case, in which neither parent is certain from extrinsic evidence 
(Crawford et al. 1993; Garber and Morris 1983). Formulae for single-locus 
mean exclusion probabilities under these respective models are as follows, 
where p_i is the frequency of the ith codominant allele at an autosomal locus: 
Mean exclusion values calculated for individual loci can then be combined 
across K independent marker loci into a total average exclusion probability for a 
given study (Boyd 1954): 

$$P = 1 - \prod_{k=1}^{K} (1 - P_k)$$

where P_k is the mean exclusion probability at the kth locus. 
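
The single-locus and multi-locus calculations can be sketched in Python as follows. The closed forms used for the two single-locus cases are the ones widely quoted for codominant markers in the parentage literature; they are an assumption of this sketch and should be checked against the sources cited above. The function names and the example allele frequencies are hypothetical.

```python
def exclusion_one_parent_known(p):
    """Mean exclusion probability when one parent is known.

    `p` is a list of allele frequencies at one codominant locus.
    Assumed form: 1 - 2*S2 + S3 + 2*S4 - 3*S5 - 2*S2^2 + 3*S2*S3,
    where Sk is the sum of the kth powers of the frequencies.
    """
    s2, s3, s4, s5 = (sum(x**k for x in p) for k in (2, 3, 4, 5))
    return 1 - 2 * s2 + s3 + 2 * s4 - 3 * s5 - 2 * s2**2 + 3 * s2 * s3


def exclusion_neither_parent_known(p):
    """Mean exclusion probability when neither parent is known.

    Assumed form: 1 - 4*S2 + 2*S2^2 + 4*S3 - 3*S4.
    """
    s2, s3, s4 = (sum(x**k for x in p) for k in (2, 3, 4))
    return 1 - 4 * s2 + 2 * s2**2 + 4 * s3 - 3 * s4


def combined_exclusion(per_locus):
    """Combine independent loci: one minus the product of (1 - P_k)."""
    remaining = 1.0
    for e in per_locus:
        remaining *= 1.0 - e
    return 1.0 - remaining


# Hypothetical microsatellite locus with ten equally frequent alleles.
locus = [0.1] * 10
single = exclusion_one_parent_known(locus)
print(single, combined_exclusion([single] * 5))
```

Under these assumptions, five such loci already give a combined exclusion probability above 0.99, which illustrates why a handful of hypervariable markers is often sufficient.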
The models described above typically assume that samples were taken 
from a randomly mating deme, but if biological phenomena such as popula- 
tion structure, philopatry, or inbreeding apply, these equations can artificially 
inflate exclusion probabilities relative to their true values (Double et al. 1997). 
Many other statistical tasks associated with various nuances of parentage 
assessment have been developed (see review in Jones and Ardren 2003). For 
specified biological settings, maximum likelihood approaches are available to 
categorically assign individuals to their parents (Coltman et al. 1998b) or to 
fractionally assign parentage to multiple non-excluded adults (Smouse and 
Meagher 1994). Computer programs used to implement these or other methods 
include CERVUS (Marshall et al. 1998), FAMOZ (Marshall et al. 1998), KINSHIP 
(Goodnight and Queller 1999), PAPA (Duchesne et al. 2002), PARENTE 
(Cercueil et al. 2002), PATRI (Signorovitch and Nielsen 2002), and PROBMAX 
(Danzmann 1997). Some statistical approaches are targeted toward quite specific 
problems. For example, in many fishes and other species with hundreds or 
thousands of offspring in a clutch, methods of statistical correction (e.g., for 
finite marker data or incomplete sampling of candidate individuals in the 
population) have been devised to refine estimates of multiple mating and the mean 
number of reproductive adults contributing to a half-sib progeny array 
(DeWoody et al. 2000a; Fiumera et al. 2001; Jones 2001; Neff and Pitcher 2002), 
the proportion of next-generation offspring sired by a focal male (Fiumera et al. 
2002a; Neff et al. 2000a,b, 2002), and the proportion of broods with at least two 
contributing members of each adult sex (Neff et al. 2002). 
scnRFLP loci. From their data (Table 5.5), the following observations 
and deductions were made. Two goslings in family 2 (numbers 7 and 8) 
proved to be homozygous at some loci (E and M) for alleles not present 
in the female attendant. Such cases excluded the putative mother and 
were interpreted to reveal instances of intraspecific brood parasitism 
(IBP), or “egg-dumping” (Petrie and Møller 1991), whereby other 
females (not assayed) must have contributed eggs to the nest. Other 
goslings (e.g., number 6 in family 2) proved to be homozygous (locus 
M) for alleles not present in the male attendant. Such cases excluded the 
putative father and were interpreted to reveal likely instances of extra- 
pair fertilization (EPF) by other males. Some heterozygous loci (e.g., J in 
gosling 5, family 2) exhibited one allele not observed in either nest 
attendant and a second allele present in both attendants. Such loci 
exclude one of the putative parents, but do not alone determine which 
attendant is disallowed. Finally, some loci (e.g., C in gosling 4, family 1) 
were homozygous for alleles not observed in either nest attendant, thus 
excluding both. Overall, the genetic markers revealed that otherwise 
cryptic IBP and EPF behavioral events must be relatively common in 
snow goose populations (see also Lank et al. 1989). 


2. Maternity known, paternity to be decided among a few candidate males. Burke 
et al. (1989) applied multi-locus DNA fingerprint assays to the dunnock 
sparrow (Prunella modularis), a species with a mating system in which 
two males often mate with a single female and defend her territory 
(Davies 1992). In the DNA fingerprints, those bands in progeny that 
could not have been inherited from the known mother were identified as 
paternally derived. Then, the true father was determined by comparing 
bands from the fingerprints of candidate sires against these paternal alleles 
in progeny. Figure 5.6 shows DNA fingerprints from one known 
mother (M), her four offspring (D-G), and two candidate sires (Pα and 
Pβ). In this family, the genetic data demonstrate that progeny G was 
sired by Pα, whereas D, E, and F were fathered by Pβ. Thus, molecular 
data confirmed that individual dunnock broods can be multiply sired. 


3. One parent or two? Many plants and invertebrate animals are hermaph- 
roditic; that is, an individual produces both male and female gametes. 
Such individuals may self-fertilize (in which case offspring have a sin- 
gle parent), or matings with other individuals may be facultative or 
compulsory (producing two-parent progeny). For wild-caught females 
whose mating habits are in question, genetic examination of progeny 
can reveal whether some of them carry alleles that are not present in the 
mother and, hence, derive from cross-fertilizations. Furthermore, com- 
parisons of population genotypic frequencies against Hardy-Weinberg 
expectations can aid in deciding whether cross-fertilization or self-fertil- 
ization predominates at the population level (because selfing is an 
intense form of inbreeding whose continuance leads to pronounced 
heterozygote deficits). In examples of these approaches, allozyme data 
were employed to show that cross-fertilization is the prevailing mode of 
reproduction in several freshwater species of hermaphroditic snails in 
the genera Bulinus and Biomphalaria (Rollinson 1986; Vrijenhoek and 
Graven 1992; Woodruff et al. 1985), that intermediate levels of self-fertil- 
ization characterize the Florida tree snail Liguus fasciatus (Hillis 1989) 
and the coral Goniastrea favulus (Stoddart et al. 1988), and that self-fertil- 
ization predominates in populations of the sea anemone Epiactis prolifera 
(Bucklin et al. 1984). In allozyme studies of 19 species of terrestrial slugs 
in the families Limacidae and Arionidae, most of the taxa were shown 
to be predominant outcrossers (Foltz et al. 1982, 1984). 





TABLE 5.5 (loci A-F)

                    A       B       C       D       E       F
Family 1
  Male attendant    2,2     2,2     2,3     1,2     1,1     1,1
  Female attendant  2,2     2,2     2,2     1,1     1,1     1,1
  Gosling 1         2,2     2,2     2,2     1,2     1,1     1,1
  Gosling 2         2,2     2,2     2,2     1,1     1,1     1,1
  Gosling 3         2,2     2,2     2,2     1,2     1,1     1,1
  Gosling 4         2,3^a   2,2     1,1^d   1,1     1,1     1,1

Family 2
  Male attendant    1,2     2,2     2,4     1,1     1,1     1,1
  Female attendant  2,2     2,2     1,2     1,1     2,2     1,1
  Gosling 5         2,2     2,2     1,2     1,1     1,2     1,1
  Gosling 6         2,2     2,2     2,4     1,1     1,2     1,1
  Gosling 7         1,2     2,2     2,4     1,1     1,1^b   1,1
  Gosling 8         2,3^a   2,2     2,2     1,1     1,1^b   1,1
  Gosling 9         1,2     2,2     2,2     1,1     1,2     1,1

Family 3
  Male attendant    2,2     2,2     3,3     1,2     1,1     1,1
  Female attendant  2,2     1,2     1,1     1,1     1,1     1,1
  Gosling 10        2,2     1,2     1,3     1,2     1,1     1,1
  Gosling 11        2,2     1,2     1,3     1,2     1,1     1,1
  Gosling 12        2,2     2,2     1,3     1,2     1,1     1,1
  Gosling 13        2,2     2,2     1,3     1,2     1,1     1,1

Source: After T. W. Quinn et al. 1987. 

Notes: Letters are loci; numbers are allelic designations. Some goslings in families 1 and 2 
(boldface) show genetic evidence of EPF or IBP (see text). 

^a Excludes one unspecified parent; ^b excludes putative mother; ^c excludes putative father; 
^d excludes both putative parents. 
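
The single-locus exclusion logic applied to the snow goose families can be sketched in Python. The sketch assumes codominant Mendelian markers with no mutation and no null alleles. The function name and the returned labels are illustrative, and the genotypes in the example are patterned on the four kinds of cases described in the text.

```python
def classify_locus(offspring, mother, father):
    """Classify one locus for a juvenile and its two nest attendants.

    Each genotype is a 2-tuple of allele labels.
    """
    x, y = offspring
    mother_fits = x in mother or y in mother
    father_fits = x in father or y in father
    if not mother_fits and not father_fits:
        return "excludes both putative parents"
    if not mother_fits:
        return "excludes putative mother"
    if not father_fits:
        return "excludes putative father"
    # Each attendant shares an allele with the juvenile; check whether
    # together they can account for both of its alleles.
    pair_fits = (x in mother and y in father) or (y in mother and x in father)
    if pair_fits:
        return "compatible"
    return "excludes one unspecified parent"


# (offspring, putative mother, putative father)
cases = {
    "gosling 7, locus E": ((1, 1), (2, 2), (1, 1)),
    "gosling 6, locus M": ((2, 2), (2, 2), (1, 1)),
    "gosling 5, locus J": ((1, 2), (1, 1), (1, 1)),
    "gosling 4, locus C": ((1, 1), (2, 2), (2, 3)),
}
for label, (kid, mom, dad) in cases.items():
    print(label, "->", classify_locus(kid, mom, dad))
```

In practice the per-locus calls would be accumulated across all loci for each juvenile, because a single exclusion at any locus suffices to reject a putative parent.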
4. Self-fertilization or parthenogenesis? As mentioned earlier, another form of 
uniparental reproduction is parthenogenesis, which can be experimen- 
tally distinguished from self-fertilization by examining diploid geno- 
types among offspring of a heterozygous parent. Fixed heterozygosity 
among progeny is inconsistent with expectations of Mendelian segrega- 
tion under self-fertilization, but it is a hallmark of ameiotic partheno- 
genesis. Using allozyme markers in this context, Hoffman (1983) docu- 
mented that a laboratory population of a slug species (Deroceras laeve) 
formerly suspected of self-fertilization actually reproduced partheno- 
genetically. Further discussion of parentage in the context of partheno- 
genetic reproduction will be deferred to Chapter 7. 
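
Two of the inferences in examples 3 and 4 reduce to short calculations, sketched here in Python. The first converts a heterozygote deficit into an inbreeding coefficient and then into a selfing rate, assuming that the population is at inbreeding equilibrium and that selfing is the only cause of the deficit (F = s / (2 - s)). The second gives the chance that every progeny of a selfed heterozygote is itself heterozygous, which is the quantity that makes fixed heterozygosity such strong evidence for ameiotic parthenogenesis. Function names and example values are hypothetical.

```python
def inbreeding_coefficient(h_obs, h_exp):
    """F = 1 - (observed heterozygosity / Hardy-Weinberg expectation)."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


def selfing_rate_from_f(f):
    """Equilibrium selfing rate s implied by F = s / (2 - s)."""
    return 2.0 * f / (1.0 + f)


def prob_all_heterozygous_under_selfing(n):
    """Chance that all n progeny of a selfed heterozygote are heterozygous.

    Mendelian segregation gives each progeny a probability of 1/2.
    """
    return 0.5**n


f = inbreeding_coefficient(h_obs=0.25, h_exp=0.5)
print(f, selfing_rate_from_f(f))
print(prob_all_heterozygous_under_selfing(10))
```

With ten progeny, uniform heterozygosity would arise by chance under selfing less than once in a thousand trials, so even modest progeny arrays can discriminate between the two modes of uniparental reproduction.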





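TABLE 5.5 (continued; loci G-N, Families 2 and 3)

                    G       H       I       J       K       L       M       N
Family 2
  Male attendant    1,1     1,2     1,1     1,1     1,1     1,2     1,1     1,2
  Female attendant  1,2     1,2     2,2     1,1     1,1     1,1     2,2     1,2
  Gosling 5         1,1     2,2     1,2     1,2^a   1,1     1,1     1,2     1,2
  Gosling 6         1,2     1,1     1,2     1,1     1,1     1,1     2,2^c   1,1
  Gosling 7         1,1     1,2     2,2^c   1,1     1,1     1,1     1,1^b   1,1
  Gosling 8         2,2     1,1     1,2     1,1     1,1     1,1     1,1^b   2,2
  Gosling 9         1,2     2,2     1,2     1,1     1,1     1,2     1,2     1,2

Family 3
  Male attendant    1,1     1,2     1,2     1,1     1,1     1,1     1,1     1,2
  Female attendant  1,1     2,2     1,1     1,1     1,2     1,2     1,2     1,2
  Gosling 10        1,1     1,2     1,2     1,1     1,2     1,1     1,2     1,2
  Gosling 11        1,1     1,2     1,2     1,1     1,1     1,2     1,1     1,2
  Gosling 12        1,1     2,2     1,2     1,1     1,2     1,1     1,2     1,2
  Gosling 13        1,1     2,2     1,2     1,1     1,1     1,2     1,1     2,2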
Figure 5.6 Parentage analysis by multi-locus DNA fingerprinting. Shown is a 
gel designed to assess whether each of four offspring (D-G) with known mother M 
was sired by Pα or Pβ. Boxes encompass paternally derived bands in progeny that 
permitted choice between the two candidate sires. (After Burke et al. 1989.) 
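
The band-matching logic used with a known mother and a few candidate sires can be sketched in Python with set operations. Bands are represented by arbitrary integer labels, and all band sets in the example are invented for illustration; they are not read from Figure 5.6. The sketch also assumes that bands are scored without error and that no new bands arise by mutation.

```python
def paternal_bands(offspring, mother):
    """Bands in the offspring that cannot have come from the known mother."""
    return set(offspring) - set(mother)


def compatible_sires(offspring, mother, candidates):
    """Names of candidate sires that carry every paternally derived band."""
    needed = paternal_bands(offspring, mother)
    return sorted(
        name for name, bands in candidates.items() if needed <= set(bands)
    )


# Hypothetical band sets.
mother = {2, 5, 9, 14}
candidates = {"P_alpha": {1, 5, 8, 12}, "P_beta": {3, 6, 9, 11}}
progeny = {"D": {2, 3, 9, 11}, "G": {1, 5, 8, 14}}

for name, bands in progeny.items():
    print(name, "->", compatible_sires(bands, mother, candidates))
```

Bands shared by the mother and a candidate sire carry no information under this scheme, which is why only the diagnostic (boxed) bands in a fingerprint contribute to the choice between males.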


Behavioral and evolutionary contexts 


Knowledge of biological parentage is important in many behavioral and eco- 
logical contexts. For example, matings are difficult to observe directly in 
nature for many species, but reproductive behaviors and patterns of gene 
flow (see Chapter 6) nonetheless can be deduced from molecular information 
on maternity and paternity. Proper interpretations of behavioral interactions 
between presumed family members depend on knowledge of genetic ties, 
including parentage. Even when matings can be readily observed, questions 
of genetic parentage remain of interest. In many birds and mammals, for 
example, copulations are known to occur outside the socially bonded pair, but 
until molecular markers were applied, the extent to which these matings 
resulted in illegitimate young remained uncertain, constituting a major defi- 
ciency in understanding of sexual selection and the ecology of mating systems 
(Fleischer 1996; Mock 1983; Trivers 1972). By revealing genetic parentage, 
molecular data provide direct assessments of realized reproductive success 
and therefore largely circumvent the danger of equating mating prowess or 
other components of reproduction with actual gene transfer across generations 
(Møller and Ninni 1998). Knowledge of biological parentage is also 
critical for correct interpretation of the transmission genetics or heritabilities of 
morphological and other phenotypic characteristics as deduced from field 
data on presumed parent-offspring relationships (Alatalo et al. 1984). 
Typically, molecular data revealing genetic maternity and paternity 
within particular broods or clutches are accumulated across many families, 
such that the results collectively describe the “genetic mating system” (often 
quite different from the field-observed “social mating system”) of a popula- 
tion or species. Figure 5.7 provides definitions of various genetic mating 
systems. It also summarizes their oft-suspected relationships to the intensi- 
ty of sexual selection (resulting from differential abilities among individuals 
of the two genders to acquire mates) and the degree of elaboration in each 
sex of secondary sexual traits (those arising from sexual selection). For 
example, conventional wisdom holds that males in polygynous species are 
often under strong sexual selection and therefore display pronounced body 
adornments (e.g., large antlers in bull elk or flashy tails in peacocks) arising 
from intrasexual or intersexual competition for mates; whereas in polyan- 
drous species, it is females who are likely to be under intense sexual selec- 
tion and thus perhaps more ornamented with secondary sexual features. 


[Figure 5.7: diagrams of four genetic mating systems, arranged left to right as Polygyny (sexual selection strong), Monogamy (weak), Polygynandry (weak or variable), and Polyandry (strong). Upper axis: general intensity of sexual selection. Lower axis: gender in which secondary sexual traits are most likely elaborated.] 

Figure 5.7 Pictorial definitions of four possible genetic mating systems. Lines 
connecting males and females indicate mating partners that produce offspring. 
Also shown are theoretical gradients in sexual selection intensities and degrees of 
gender dimorphism in secondary sexual traits often thought to be associated with 
these genetic mating systems. (From Avise et al. 2002.) 
ductive success. Thus, these traditional behavior-based methods of fitness 
estimation soon came to be appreciated as grossly inadequate predictors of 
successful progeny production in many primate societies. 

More recent studies exemplify how microsatellite markers are further 
contributing to knowledge of primate parentage and social behavior. In the 
orangutan (Pongo pygmaeus), classic adult males are huge and display sec- 
ondary sexual features including wide cheek pads (“flanges”) and a large 
throat sac for emitting loud calls. Other males do not develop these features 
to nearly the same extent, but instead are "developmentally arrested" for up 
to 20 years after reaching sexual maturity. Thus, this species shows a pro- 
nounced "bimaturism" among males, which researchers had thought must 
be due to a social environment wherein the presence of a classic adult male 
hormonally suppresses the full maturation of subordinate males in his 
vicinity. A somewhat different perspective on the topic has recently come to 
light (Maggioncalda and Sapolsky 2002). Observations on the animals' 
behaviors and hormone levels did not square with the notion that subordi- 
nate males are abnormally stressed. Furthermore, microsatellite analyses 
revealed that about 50% of offspring in a Sumatran population were sired 
by unflanged males (Utami et al. 2002), who clearly force themselves upon 
(i.e., rape) females (a tactic used much less often by dominant males). These 
findings were interpreted to indicate that unflanged males are not patho- 
logical or debilitated specimens, but rather are employing a genetically suc- 
cessful "alternative reproductive tactic" to classic adult maleness. 

Another recent paternity analysis, on savanna baboons (Papio cyno- 
cephalus), uncovered perhaps the first genetic evidence in any wild species 
that fathers can distinguish their own from other males' offspring in polyg- 
amous multi-male, multi-female assemblages. By combining results from 
microsatellite paternity analyses with 30 years of observational data on wild 
baboons in Kenya's Amboseli Basin, Buchan et al. (2003) showed that in 
resolving fights among juveniles, adult males were significantly more likely 
to support their own biological offspring than they were to intervene on 
behalf of unrelated young. This apparent knowledge of paternity by baboon 
males might be due to direct cues (such as a juvenile's appearance or smell), 
indirect cues (e.g., a male might assess his paternity probability based on his 
frequency of past copulations with mothers of particular offspring), or both. 
Whatever the mechanism of kinship recognition, nepotistic males tend to 
behave as if cognizant of their biological paternity. Among other ramifica- 
tions, this genetic finding casts doubt on one conventional hypothesis for 
multiple mating by female baboons—that it confuses paternity within a 
troop and thereby serves either to enlist more adult males in offspring care 
or to reduce the risk of infanticide by unrelated adult males. 


BIRDS. Parentage analyses via molecular markers have revolutionized 
thought in avian sociobiology by documenting that individual broods fre- 
quently contain progeny from at least one biological parent other than the 
attendant care-givers (Birkhead and Møller 1992; Ligon 1999; Westneat et al. 
1990). Conventional wisdom was that Passeriformes (perching birds), in 
particular, are among the most monogamous of organisms (Gill 1990; Lack 
1968). However, Gowaty and Karlin (1984; see also Gowaty and Bridges 
1991a; Meek et al. 1994) were among the first to report genetic evidence for 
high frequencies of multiple paternity (ca. 8%-35% illegitimate young) in a 
purportedly monogamous passeriform, the eastern bluebird (Sialia sialis). 
Since then, in more than 150 molecular studies encompassing about 130 
avian species and a total of more than 25,000 offspring (Griffith et al. 2002), 
many additional examples of multiple concurrent paternity have come to 
light, with foster young often found at high frequencies (Table 5.6). 


TABLE 5.6 Representative frequencies of extra-pair offspring (EPO) detected by molecular 
markers in various avian species 

                                               No. broods  % broods    No. chicks  % EPO
Species                                        assayed     with EPO^a  assayed     chicks^a  Reference

Bobolink, Dolichonyx oryzivorus                38          19          840         15        Bollinger and Gavin 1991
Corn bunting, Miliaria calandra                15          7           44          5         Hartley et al. 1993
Eastern kingbird, Tyrannus tyrannus            19          47          60          30        McKitrick 1990
Eurasian dotterel, Charadrius morinellus       22          5           44          5         Owens et al. 1995
Field sparrow, Spizella pusilla                17          41          52          19        Petter et al. 1990
Hooded warbler, Wilsonia citrina               17          47          78          29        Stutchbury et al. 1994
House sparrow, Passer domesticus               183         26          536         14        Wetton and Parkin 1991
House wren, Troglodytes aedon                  18          22          97          6         Price et al. 1989
Mallard duck, Anas platyrhynchos               46          17          298         3         Evarts and Williams 1987
Northern cardinal, Cardinalis cardinalis       16          19          37          14        Ritchison et al. 1994
Pied flycatcher, Ficedula hypoleuca            22          14          131         20        Gelter and Tegelström 1992
Red-cockaded woodpecker, Picoides borealis     28          4           48          1         Haig et al. 1994
White-crowned sparrow, Zonotrichia leucophrys  35          26          110         34        Sherman and Morton 1988
White-fronted bee-eater, Merops bullockoides   65          —           97          10        Wrege and Emlen 1987

Note: For details and more examples, see Gowaty (1996) and Westneat and Stewart (2003). 
^a Often minimum estimates because of limited exclusionary power in the markers employed; see Westneat et 
al. 1987 for discussion of this problem. 
For example, an allozyme survey of the indigo bunting (Passerina cyanea) 
established that at least 37 of 257 offspring (14%) carried genotypes incompatible 
with the behaviorally suspected father (Westneat 1987). Statistical corrections 
that account for limited detection probabilities (because only a few 
polymorphic markers were employed) raised the estimated frequency of EPFs 
to as high as 42%. This latter estimate agreed with a DNA fingerprinting survey 
of another indigo bunting population, in which 22 of 63 nestlings (35%) 
were shown to have resulted from extra-pair fertilizations (Westneat 1990). 
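
The effect of limited marker power on such estimates can be illustrated with a simple correction, sketched here in Python. The sketch assumes that each extra-pair offspring is detected independently with a fixed probability, so that the observed frequency is divided by that probability. This is a deliberately simplified stand-in and is not claimed to be the procedure used in the study cited above; the detection probability of 0.34 is a hypothetical value chosen for illustration.

```python
def corrected_epf_frequency(n_excluded, n_assayed, detection_probability):
    """Scale an observed exclusion frequency by the chance of detection.

    detection_probability is the assumed probability that a true
    extra-pair offspring shows a genotype incompatible with the
    putative father.
    """
    if not 0 < detection_probability <= 1:
        raise ValueError("detection probability must be in (0, 1]")
    observed = n_excluded / n_assayed
    return min(1.0, observed / detection_probability)


# Observed counts from the text; detection probability is hypothetical.
print(corrected_epf_frequency(37, 257, 1.0))
print(corrected_epf_frequency(37, 257, 0.34))
```

With perfect detection the estimate stays at the observed 14%, whereas a detection probability near one-third roughly triples it, which shows how a minimum estimate from a few allozyme loci can be consistent with a much higher true frequency.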

Not all genetic appraisals of passeriform species have produced evidence 
for extra-pair offspring (Griffith 2000). For example, among 176 offspring 
from 32 families of warblers in the genus Phylloscopus, no foster young were 
found by sensitive DNA fingerprint assays (Gyllensten et al. 1990). Likewise, 
nearly all 222 assayed juveniles of the bicolored wren (Campylorhynchus 
griseus) were produced by the primary mated pairs (Haydock et al. 1996), and 
the same proved true in microsatellite assays for nearly all of 139 scrub jay off- 
spring (Aphelocoma coerulescens) produced by 34 assayed adult pairs (Quinn et 
al. 1999). At the other end of the continuum, the current record for highest fre- 
quency of extra-pair offspring in the nest may belong to the superb fairy-wren 
(Malurus cyaneus), in which molecular markers have documented that more 
than 70% of offspring are sired by males other than the putative father 
(Double and Cockburn 2000; Mulder et al. 1994). 

Likewise, in many non-passeriform birds, molecular markers have often 
revealed multiple paternity or maternity within a clutch. An interesting 
example involved genetic documentation of communal nesting in the 
ostrich, Struthio camelus. From microsatellite assays of a population in 
Nairobi National Park, Kimwele and Graves (2003) discovered that only 
about 30% of all incubated eggs were genetically parented by both the resi- 
dent territorial male and female. Instead, all surveyed males had fertilized 
at least some eggs in the clutches of neighboring males, and every surveyed 
female had laid eggs not only in her own nest, but also in neighboring ones. 
Some genetic reappraisals of non-passerines have found no evidence for 
EPFs or IBPs, however. For example, DNA fingerprints of black vultures 
(Coragyps atratus) confirmed that this species is genetically as well as social- 
ly monogamous (Decker et al. 1993), as did comparable molecular data for 
Leach's storm-petrel (Oceanodroma leucorhoa; Mauck et al. 1995), Cory's 
shearwater (Calonectris diomedea; Swatschek et al. 1994), the crested penguin 
(Eudyptes pachyrhynchus; McLean et al. 2000), and the endangered New 
Zealand takahe (Porphyrio hochstetteri; Lettink et al. 2002). The broader point 
is that molecular appraisals of numerous avian species evidence not only 
the power of marker-based parentage analyses, but also the shortcomings of 
traditional field observations as secure guides to actual mating behaviors 
and genetic mating systems. 

Literature reviews have found no correlation between EPF rate and the 
nesting density or degree of coloniality in avian species (Westneat and 
Sherman 1997; Wink and Dyrcz 1999; but see also Møller and Birkhead 
1993a). Thus, other factors must be involved. These factors have been the 
subject of many investigations that combine genetic assessments with 
behavioral or life history observations in the field (Petrie and Kempenaers 
1998). In one such example, the frequency of foster nestlings proved to be 
significantly greater in the broods of eastern bluebird males who were in 
their first breeding season, who were paired with females who frequently 
strayed from home territory during their fertile periods, and who exhibited 
mate-guarding behavior (a counterintuitive result, unless it is supposed that 
these males sense a propensity for cheating by their mates and monitor 
them accordingly). Gowaty and Bridges (1991b) interpreted some of these 
trends as consistent with the postulate that female bluebirds actively pursue 
EPFs, rather than receiving them passively or by coercion from EPF-seeking 
males (as might be assumed in traditional mating system theory). In theory, 
EPFs might be selectively advantageous to a female for any of several rea- 
sons: they might generate higher genetic diversity among her progeny 
(Foerster et al. 2003); they might afford her more opportunities to obtain 
"good genes" for her progeny (Hamilton 1990; Møller and Alatalo 1999) or 
higher genetic compatibility with a male who will sire her offspring 
(Kempenaers et al. 1999; Tregenza and Wedell 2000); they might afford a 
female enhanced access to male resources or services (see reviews in Burley 
and Parker 1998; Kokko et al. 2003; Møller 1998); and they might provide 
"fertilization insurance" (as demonstrated experimentally in Ficedula 
flycatchers by Török et al. 2003; but see Olsson and Shine 1997 for a different 
outcome in a study involving lizards). Of course, extra-pair matings can 
have high costs as well, not least of which (for both sexes) is the danger of 
contracting sexually transmitted diseases (Kokko et al. 2002). 

In DNA fingerprinting studies of the great reed warbler (Acrocephalus 
arundinaceus), Hasselquist et al. (1996) showed that females tend to obtain 
successful EPFs from neighboring males with larger song repertoires than 
their social mates, and that the offspring thereby produced also show high- 
er survival. These results were interpreted to support the hypothesis that by 
engaging in EPFs, females in effect are seeking genetic benefits for their 
progeny. Additional evidence of this sort has added considerable strength to 
the notion that females (as well as males) in socially monogamous species 
do indeed obtain a variety of fitness benefits from extra-pair matings that 
underlie what is actually a genetically polygamous mating system (Gowaty 
1996; Gray 1998; Ketterson et al. 1998). 

Although EPFs may often benefit females (as well as cuckolding males), 
any males that get cuckolded would of course be disadvantaged by this 
phenomenon, leading to selection pressures on males not only for cuckoldry 
avoidance, but also for paternity assurance coupled to nestling investment 
(Møller and Cuervo 2000). Reed buntings (Emberiza schoeniclus) have exceptionally 
high extra-pair paternity, with 55% of 216 assayed young in 86% of 
58 nests showing this phenomenon in one DNA fingerprinting study (Dixon 
et al. 1994). By combining these genetic results with field data on magnitudes 
of paternal investment in broods, these authors showed that males 
apparently can assess their likelihood of paternity and adjust their nestling 
provisioning rates accordingly. In similar studies of the zebra finch 
(Taeniopygia guttata), another passeriform species with a high proportion 
(28%) of illegitimate young due to EPFs, males that appeared to be unattractive 
to females were found to accrue fitness gains by adopting a high 
parental investment (PI) strategy, whereas males that were attractive to 
females tended to benefit more by decreasing PI and increasing allocation to 
EPF behaviors (Burley et al. 1996). 

Other detailed avian studies have integrated demographic or behav- 
ioral observations with genetic information on parentage. In both purple 
martins (Progne subis) and Bullock's orioles (Icterus galbula), DNA finger- 
printing showed that older males achieved much higher reproductive suc- 
cess than did younger males, a result attributed to forced copulations by 
older males in the case of the martins (Morton et al. 1990) and to active 
female choice in the orioles (Richardson and Burke 1999). In a DNA finger- 
printing study of polygynous red-winged blackbirds (Agelaius phoeniceus), 
the proportion of illegitimate chicks was found to be significantly greater in 
marshes with higher male densities, and the cuckolding males were often 
territorial neighbors (Gibbs et al. 1990). In a behavioral and DNA finger- 
printing analysis of the blue tit (Parus caeruleus), Kempenaers et al. (1992) 
found that attractive males (those receiving many visits from neighboring 
females) were larger, survived better, and suffered less loss in paternity (had 
fewer extra-pair young, in their own nests) than did unattractive males. 
These results were interpreted as supportive of a "genetic quality hypothe- 
sis" wherein females assess male quality and mate preferentially with supe- 
rior individuals. In a series of studies on the barn swallow (Hirundo rustica), 
molecular markers coupled with experimental approaches revealed that 
males with longer and more symmetrical tail streamers tended to have 
increased paternity assurance within their own nests as well as more off- 
spring in extra-pair broods, but that these fitness advantages via sexual 
selection were partially offset by natural selection against long-tailed individuals 
(Møller 1992; Møller and Tegelström 1997; Møller et al. 1998; Saino 
et al. 1997; H. G. Smith et al. 1991). 

For the many birds that live in social groups, it is of interest to know 
which individuals actually participate in reproduction. The Galápagos 
hawk (Buteo galapagoensis) has an unusual social arrangement typically consisting 
of one adult female and up to eight oft-unrelated males. DNA fingerprinting 
assays of 66 hawks from ten breeding groups confirmed that the 
mating system is polyandrous ("cooperative polyandry" in this case), with 
males within a group exhibiting rather egalitarian reproductive success 
(Faaborg et al. 1995). Another form of grouping involves lekking behavior, 
wherein individuals assemble in a communal area for courtship display. 
Dominant males in such leks are often assumed to achieve a disproportion- 
ate share of successful matings. However, in a study that combined field 
observations with a microsatellite assessment of paternity in one such 
species, the buff-breasted sandpiper (Tryngites subruficollis), the variance in 
male reproductive success proved to be much lower than expected (Lanctot 
et al. 1997), due to multiple mating by females (sometimes with males off 
the leks as well) and to the use of alternative reproductive tactics by males 
(Lanctot et al. 1998). 

In some birds, such as the acorn woodpecker (Melanerpes formicivorus) 
and stripe-backed wren (Campylorhynchus nuchalis), young often remain in 
their natal groups and appear to assist adult kin in rearing new broods. 
Under sociobiological theory, postponement of dispersal and breeding to 
assist in the rearing of others’ progeny may be favored by natural selection 
if a juvenile's gain in inclusive genetic fitness from helping to produce close 
kin exceeds its expected gain in personal reproductive fitness had it dis- 
persed (Brown 1987; Hamilton 1964). In the acorn woodpecker, DNA fin- 
gerprinting analyses revealed that helpers at the nest indeed are genetic off- 
spring of the monogamous pair of parents that they assist (Dickinson et al. 
1995). However, in the stripe-backed wren, DNA fingerprinting studies 
demonstrated a more direct avenue by which helper fitness can be 
enhanced. In some social groups, auxiliary males formerly thought to be 
nonreproductive actually sired some offspring (Rabenold et al. 1990). Such 
reproduction by subordinate male wrens may further help to explain their 
long tenure as helpers at the nest. In contrast, only dominant female wrens 
proved to be reproductively successful, a result interpreted as consistent 
with the proclivity of young females to compete for breeding sites outside 
the natal group. 
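
The inclusive-fitness comparison invoked above for helpers at the nest can be written as a simple inequality. What follows is only a minimal sketch, not a model taken from the cited studies: the function name, its parameters, and the numerical values are hypothetical, and relatedness to one's own offspring is assumed to be 0.5 (a diploid, outbred population).

```python
def helping_favored(extra_kin_raised, r_to_kin, own_offspring_forgone, r_to_own=0.5):
    """Return True when the indirect gain from helping exceeds the direct loss.

    extra_kin_raised      -- additional kin reared because of the helper (hypothetical input)
    r_to_kin              -- relatedness of the helper to those kin (0.5 for full sibs)
    own_offspring_forgone -- offspring the helper would have produced had it dispersed
    r_to_own              -- relatedness to own offspring (assumed 0.5)
    """
    indirect_gain = r_to_kin * extra_kin_raised
    direct_loss = r_to_own * own_offspring_forgone
    return indirect_gain > direct_loss


# Helper that adds two full sibs to the brood but forgoes one offspring of its own.
full_sib_case = helping_favored(2.0, 0.5, 1.0)
# Same cost, but the extra young are only half sibs and only one is added.
half_sib_case = helping_favored(1.0, 0.25, 1.0)
```

Under these assumed values, helping pays in the first case and not in the second. This is why genetic confirmation that helpers are full offspring of the pair they assist, or that they sire young themselves, bears directly on the argument.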

In DNA analyses of another cooperatively breeding bird, the Arabian 
babbler (Turdoides squamiceps), one additional feature of reproduction by 
subordinates was uncovered. In this species, beta males that sired young 
proved to have significantly lower genetic similarity to the alpha male in 
their group than did those without offspring (Lundy et al. 1998). Thus, these 
more successful beta males probably were immigrants into the cooperative 
breeding groups, whereas the less successful beta males may have been 
stay-at-home young. Finally, in a similar microsatellite-based paternity 
analysis of cooperatively breeding carrion crows (Corvus corone), Baglione et 
al. (2003) found that although young male birds leave their natal groups to 
visit various others, they tend to settle and compete for matings in groups 
made up of individuals to whom they are moderately related (rather than 
unrelated). These results suggest that settlement patterns in this species are 
not just a passive consequence of random dispersal behavior, but instead 
register an active preference for association with kin (as might be predicted 
if kin selection plays an important evolutionary role in shaping cooperative 
reproductive alliances). 

In addition to such species-focused studies, statistical summaries of 
data have identified several factors correlated with avian EPFs. For exam- 
ple, EPF frequencies often tend to be higher in species in which males are 
brightly plumaged (Møller and Birkhead 1994), have relatively large testes 
(Møller and Briskie 1995), and provide little or no offspring care (Møller and 
Birkhead 1993b), as well as in species in which females seem able to compensate 
for an absence of paternal care and also may receive important indirect 
fitness benefits from EPFs (Møller 2000). Reportedly, the incidence of 
EPFs is also higher in species with high molecular variation (a result inter- 
preted as advantaging an EPF-seeking female by increasing genetic variety 
among her progeny; Petrie et al. 1998), in species with pronounced sexual 
dichromatism (Møller 1997), and in species that seem to have strong 
immunological defense mechanisms (a result interpreted as profiting EPF 
families by conferring offspring with superior resistance to virulent parasites; 
Møller 1997). Thus, such meta-analyses of the published literature 
have identified a number of empirical correlates and plausible causal inter- 
relationships between avian genetic mating systems and various evolution- 
ary and ecological phenomena such as sexual selection, sexual dimorphism, 
and individual behavioral tactics (Bennett and Owens 2002). 


FISHES. Molecular appraisals of parentage and genetic mating systems in 
fish populations did not become popular until the late 1990s (more than a 
decade after the molecular revolution in avian parentage began), but since 
then this field too has blossomed (see reviews in Avise 2001d; Avise et al. 
2002). There are several reasons for special interest in fishes. First, unlike 
most birds (or mammals), fish typically have huge clutches that afford inter- 
esting statistical challenges (see Box 5.4) as well as novel biological oppor- 
tunities for genetic parentage assessment. For example, de novo "clustered 
mutations" (which arise pre-meiotically in germ cell lineages and may enter 
the population not as singletons at a locus, but rather in clusters involving 
multiple siblings within a brood) are best sought and analyzed in species 
with large clutches (Jones et al. 1999a; Woodruff et al. 1996). Second, as indi- 
cated by a copious natural history literature, fishes collectively display 
diverse reproductive behaviors, ranging from pelagic group spawning to 
cooperative breeding to social monogamy, and this behavioral variety pro- 
vides rich fodder for genetic assessments. Third, parental care in various 
fish species may be nonexistent, confined to one gender, biparental, or com- 
munal, and it can take such varied forms as oral or gill brooding, use of nat- 
ural or constructed nests, open-water guarding of fry, or internal gestation 
by a pregnant mother or a pregnant father. Approximately 89 of 422 taxo- 
nomic families of bony fishes (21%) contain at least some species with 
parental care, and in nearly 70% of those families, the primary or exclusive 
parental custodian is the male (Blumer 1979, 1982). Paternal care of offspring 
is otherwise rather rare in the biological world (notable exceptions involve 
anuran amphibians; Clutton-Brock 1991; Wells 1977), and thus it affords a 
valuable mirror-image perspective on reproductive behaviors compared 
with the typical situation in most mammals, birds, and other groups, in 
which the female is normally the primary care-giver. 

Molecular appraisals (almost always involving microsatellite markers) 
of fish parentage and mating systems usually have involved nest-tending 
species, in which dozens of tiny embryos in a nest are collected and individually 
genotyped, together with the nest-resident or "bourgeois" male 
(the suspected father) as well as other individuals in the vicinity. If the focal 
male was indeed the true sire, each offspring in his nest should carry one 
or the other of his alleles at each autosomal locus (barring de novo muta- 
tion), and maternity of the clutch can therefore be deduced by subtraction 
(using the logic illustrated in Figure 5.8). If some or all offspring in the nest 
were not sired by the resident male, this too should be evident (as a pater- 
nity exclusion) when they consistently fail to display that male's alleles. 
The most likely biological explanation for nonpaternity within a nest then 


Figure 5.8 Genetic parentage analysis within a clutch tended by one known 
biological parent, in this case a pregnant male pipefish (Syngnathus scovelli). 
Shown is an autoradiograph of a microsatellite gel displaying banding patterns of 
standard controls (two leftmost lanes), of the pregnant male fish (third and fourth 
lanes from the left), and of each of 18 embryos taken from his brood pouch. Note 
that the father is heterozygous at this locus for alleles 188 and 138, and that each of 
his progeny displays one or the other of these alleles. Thus, the allele of maternal 
origin in each offspring is apparent by subtraction. Note also that four different 
maternal alleles are represented among these progeny, indicating that at least two 
dams (both presumably heterozygous) were involved. By combining such data and 
examining allelic associations across loci in many progeny (see DeWoody et al. 
2000d), refined estimates of maternity within such a brood can be achieved. (From 
Jones and Avise 1997a.) 

requires considered judgment that integrates the genetic findings with 
whatever natural history or other field information may be available in that 
particular instance. 
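
The exclusion and subtraction logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used in the cited studies: it treats a single autosomal locus, ignores de novo mutation and null alleles, and uses invented embryo genotypes (only the sire's alleles, 138 and 188, echo Figure 5.8). The function names are hypothetical. The dam count is a conservative minimum because each diploid dam can contribute at most two alleles at a locus.

```python
from math import ceil


def is_excluded(offspring, sire):
    """True if a diploid offspring carries neither of the sire's alleles."""
    return not (set(offspring) & set(sire))


def maternal_allele(offspring, sire):
    """Maternally derived allele deduced by subtraction at one locus.

    Returns None when the offspring is excluded from the sire or when both
    of its alleles could be paternal (maternal allele ambiguous).
    """
    a, b = offspring
    if a == b:
        return a if a in sire else None  # homozygote: each parent gave 'a'
    a_pat, b_pat = a in sire, b in sire
    if a_pat and not b_pat:
        return b
    if b_pat and not a_pat:
        return a
    return None


def summarize_nest(brood, sire):
    """Fraction of embryos excluded from the sire and minimum number of dams."""
    excluded = [o for o in brood if is_excluded(o, sire)]
    maternal = {maternal_allele(o, sire) for o in brood} - {None}
    return {
        "fraction_excluded": len(excluded) / len(brood),
        "min_dams": ceil(len(maternal) / 2),  # at most two alleles per dam
    }


# Hypothetical nest: resident male heterozygous 138/188, six genotyped embryos.
sire = (138, 188)
brood = [(138, 140), (150, 188), (138, 160), (170, 188), (138, 150), (140, 160)]
result = summarize_nest(brood, sire)
```

In this invented nest, one embryo of six carries neither paternal allele, and the remaining embryos display four distinct maternal alleles, implying at least two dams. Real analyses combine many loci, which both sharpens exclusions and resolves the ambiguous cases that a single locus leaves open.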

One primary discovery from such appraisals is that a given fish nest 
often contains half-sib offspring from multiple (typically 2-8) dams 
(DeWoody and Avise 2001). Such multiple mating by the bourgeois male has 
been genetically documented in several species of Lepomis sunfishes 
(DeWoody et al. 1998, 2000b; Mackiewicz et al. 2002), Etheostoma darters 
(DeWoody et al. 2000c; Porter et al. 2002), Spinachia sticklebacks (Jones et al. 
1998a), Pomatoschistus sand gobies (Jones et al. 2001a,b), and Cottus sculpins 
(Fiumera et al. 2002b). This pattern of multiple maternity within a nest is so 
prevalent that departures from it are of special interest. For example, genet- 
ic data demonstrated that nearly all nestmate embryos in a surveyed popu- 
lation of largemouth bass (Micropterus salmoides) were full sibs (not half-sibs 
or unrelated individuals). Thus, genetic monogamy apparently prevails in 
largemouth bass, a species that is also unusual among fishes in that both the 
sire and dam tend the offspring (DeWoody et al. 2000e). 

A second common finding from genetic appraisals is occasional foster 
parentage wherein not all embryos within a nest were sired by the resident 
male. One documented route to such nonpaternity is cuckoldry, a phenom- 
enon long studied in the bluegill sunfish, Lepomis macrochirus. In bluegill 
populations examined in Canada, three types of males exist: bourgeois 
males that mature at about 7 years of age and construct saucer-depression 
nests in colonies, attract and spawn with females, and vigorously defend 
nests and embryos; precocious sneaker males, 2 to 3 years of age, that often 
dart into a nest and release sperm; and older satellite males that mimic 
females in color and behavior, but also release sperm as the primary couple 
spawns (Gross and Charnov 1980). Genetic surveys (Colbourne et al. 1996; 
Neff 2001; Philipp and Gross 1994) have shown that about 20% of offspring 
in a bluegill colony are the result of cuckoldry by non-bourgeois males 
(Figure 5.9A). Studies have also suggested that bourgeois males can detect 
lost paternity and adaptively lower their level of parental care accordingly 
(Neff and Gross 2001). Cuckoldry has likewise been genetically document- 
ed in several other sunfish species, albeit at levels typically about an order 
of magnitude lower than in the bluegill (Figure 5.9B). 

Another suspected route to nonpaternity by bourgeois males involves 
nest takeovers, often provisionally evidenced when few or none of the off- 
spring in a given nest prove to have been sired by the resident male (see 
Figure 5.9B). Such nest piracies may be opportunistic responses by males to 
limited nest site availability, or perhaps a nest-holder captured at the time of 
sampling was merely a temporary visitor (e.g., there to cannibalize 
embryos). Yet another route to foster parentage by custodial males—egg 
thievery (wherein a few eggs are stolen from a neighbor's nest)—has been 
genetically as well as behaviorally documented in stickleback fishes (Jones 
et al. 1998a; Li and Owings 1978; Rico et al. 1992). Such egg-raiding behavior 
might seem highly counterintuitive, but one plausible explanation is that 

Figure 5.9 Molecular findings on genetic paternity in several species of North 
American sunfishes. Shown are percentages of progeny per nest that proved to 
have been sired by bourgeois males in (A) 38 nests of Lepomis macrochirus and (B) a 
total of 104 nests in four other species of Centrarchidae (see text for references). 
(From Jones and Avise 1997a.) 


this behavior benefits the thief by seeding or “priming” his own nest with 
eggs, which are known to be effective in many fish species in eliciting 
spawning responses by additional females with whom the resident then 
mates (see review in Porter et al. 2002). 

In one taxonomic family of fishes, Syngnathidae, male care of offspring 
has been taken to the extreme. In all of the 200+ living species of pipefishes 
and seahorses, females lay eggs on the ventral surface (usually an enclosed 
brood pouch) of a male, who then gestates the eggs for weeks before giving 
birth to dozens of offspring. Extensive molecular parentage analyses have 
been conducted on several syngnathid species in the genera Syngnathus 
(Jones and Avise 1997a,b; Jones et al. 1999b, 2001c), Nerophis (McCoy et al. 
2001), and Hippocampus (Jones et al. 1998b, 2003). In sharp contrast to nest- 
tending fishes, no instances of cuckoldry by foreign males were detected (as 
might be expected for male-pregnant species with mostly internal fertiliza- 
tion). This complete assurance of paternity in the Syngnathidae in turn facil- 
itates genetic analyses of maternity and mating systems. Several different 
outcomes among the surveyed species have been uncovered, ranging from 
genetic monogamy to polygynandry to polyandry. Furthermore, when these 
genetic findings were interpreted in the context of observed levels of sexual 
dimorphism and the presumed intensities and directions of sexual selection 
in the various species, results generally appeared compatible with conven- 
tional wisdom (as summarized in Figure 5.7) for taxonomic groups that 
include species with strong proclivities toward polyandry and "sex role 
reversal" (Jones and Avise 2001; Jones et al. 2000). For example, assayed syn- 
gnathid species that proved to have a polyandrous genetic mating system 
displayed greater sexual dimorphism and more pronounced secondary sex- 
ual characters in females than did monogamous species. 

Another interesting finding to emerge from genetic parentage assess- 
ments in fishes is the first firm documentation in nature of filial cannibalism 
(eating one’s own biological offspring). This phenomenon had been sus- 
pected from field observations that fish sometimes eat embryos from their 
own nests, but with genetic discoveries of widespread foster parentage, the 
possibility was raised that perhaps most cannibalism events were directed 
toward non-relatives rather than kin. DeWoody et al. (2001) critically tested 
this proposition by genotyping freshly eaten embryos dissected from the 
stomachs of wild-caught adult male sunfish and darters. Each of several 
dozen such embryos did indeed prove to have been consumed by its own 
biological father. 


PLANTS. Fatherhood in plants results from the spread of pollen, as medi- 
ated, for example, by insect pollinators or wind. Questions concerning 
pollen sources can be addressed by the same general types of molecular 
parentage analyses as described above for animals (Adams et al. 1992; 
Devlin and Ellstrand 1990). The task again is simplified when the mother is 
known (e.g., as the bearer of the seeds in question), but it can remain difficult 
when the pool of potential pollen donors is large. Paternity and the mating 
system may be addressed with regard to seeds within a fruit, fruits within 
a plant in a given season, or the lifelong seed set of an individual. 

Many plant species are hermaphroditic (meaning that a given individ- 
ual can produce both male and female gametes). Not all such specimens 
can self-fertilize, however, for several reasons: male and female flowers in 
a monoecious individual may mature at different times or be spatially sep- 
arated on the plant; stamens and stigma within a perfect flower (a flower 
possessing both male and female parts) may be positioned such that 
mechanical pollen transfer is unlikely; or self-incompatibility genes may be 
present. These “self-sterility” genes are known to carry multiple alleles that 
appear to have been selected to prevent the possible deleterious effects of 
the intense inbreeding that self-fertilization entails. Operationally, when a 
maternal parent and a pollen grain share an allele at a self-sterility locus, 
sporophytic tissue discriminates against gametophytic tissue—for exam- 
ple, by inhibiting growth of the pollen tube down the style. Nevertheless, 
self-fertilization clearly has evolved independently from outcrossing on 
numerous occasions (Stebbins 1970; Wyatt 1988), probably at least 150 
times in the Onagraceae alone (Raven 1979). 

Thus, one of the first genetic questions regarding a hermaphroditic 
species concerns the frequency with which self-fertilization (as opposed to 
outcrossing) takes place. When the female parent is known, the problem 
simplifies to one of paternity assessment, with the issue in this case being 
how often the individual plant that mothered an array of progeny or seeds 
can be excluded as the father of those genotypes. For example, any off- 
spring that exhibits alleles not present in its known mother must have aris- 
en through an outcrossing event. Several statistical models are available to 
quantify rates of selfing versus outcrossing (s and t, respectively, where s + 
t = 1) from genotypic information at one or more loci (Brown and Allard 
1970; Ennos and Clegg 1982; Ritland and Jain 1981; Schoen 1988; Shaw et al. 
1981). For example, a widely employed "mixed-mating" model (Brown 
1989; Clegg 1980) assumes that the mating process can be divided into two 
distinct components: random mating (i.e., random independent draws of 
pollen from the total population) and self-fertilization. This model may be 
especially appropriate for wind-pollinated species. A variant of this model 
that is likely to be more applicable to many insect-pollinated species 
assumes that outcrossing events within a family are correlated because 
they may involve successive pollen draws from a single male parent 
(Schoen and Clegg 1984). 
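
To make the mixed-mating model concrete, the sketch below shows one simple way that t might be estimated at a single locus. It is a simplified moment estimator written for illustration, not the maximum-likelihood procedures of the cited papers. It assumes that all sampled mothers are homozygous for allele A, that the frequency q of non-A alleles in the outcross pollen pool is known, and that outcross pollen is drawn at random. Under those assumptions selfed progeny are all homozygous, whereas an outcrossed seed is heterozygous with probability q, so the expected heterozygote fraction is t × q. The second function uses the standard equilibrium relation F = (1 - t)/(1 + t), which is likewise an addition here and not taken from the text. All names and numbers are hypothetical.

```python
def outcrossing_rate(n_het_progeny, n_progeny, pollen_freq_other):
    """Moment estimates (t, s) from progeny of mothers homozygous for allele A.

    pollen_freq_other is q, the assumed frequency of non-A alleles in the
    outcross pollen pool; P(heterozygous progeny) = t * q under the model.
    """
    if n_progeny <= 0:
        raise ValueError("n_progeny must be positive")
    if not 0 < pollen_freq_other <= 1:
        raise ValueError("pollen_freq_other must lie in (0, 1]")
    h = n_het_progeny / n_progeny
    t = min(1.0, h / pollen_freq_other)  # truncate sampling overshoot at 1
    return t, 1.0 - t


def equilibrium_inbreeding(t):
    """Expected inbreeding coefficient after many generations of mixed mating."""
    return (1.0 - t) / (1.0 + t)


# Hypothetical sample: 30 heterozygotes among 100 seeds, with q = 0.4.
t_hat, s_hat = outcrossing_rate(30, 100, 0.4)
f_eq = equilibrium_inbreeding(t_hat)
```

With these invented counts the estimate is t ≈ 0.75 and s ≈ 0.25. Multilocus estimators are preferred in practice because a single locus cannot detect outcross events in which the pollen happens to carry the maternal allele.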

From allozymes and other genetic markers, "mating system parame- 
ters" (s and t) have been empirically estimated for dozens of plant species, 
ranging from small herbaceous forms (Galloway et al. 2003) to intermediate- 
sized succulents (Massey and Hamrick 1999; Nasser et al. 2001) to large trees 
(Ruter et al. 2000). In an early summary of the literature by Lande and 
Schemske (1985), the overall frequency distribution of outcrossing rates 
proved to be bimodal, with most species either predominantly selfing or 
predominantly outcrossing (Figure 5.10A). These authors interpreted this 
bimodality as consistent with a scenario in which outcrossing is selected for 
in historically large species with substantial inbreeding depression, where- 
as selfing is favored in species in which prior pollinator failure or popula- 
tion bottlenecks have reduced the level of inbreeding depression via purg- 
ing of deleterious recessive alleles. Empirical evidence does exist for high 
variance among plant species in degree of inbreeding depression (Schemske 
and Lande 1985), with outcrossers often exhibiting the highest reductions in 
fitness under inbreeding (although this may partly reflect a bias in the early 

Figure 5.10 Outcrossing rates in plants. (A) Frequency distribution of mean outcrossing 
rates, as estimated from allozyme markers, for 55 hermaphroditic plant 
species. Lightly shaded portions of bars are animal-pollinated species; more darkly 
shaded portions are wind-pollinated species (Aide 1986). (After Schemske and Lande 
1985.) (B) Inter-population variation in outcrossing rate within each of five plant 
species: A, Lupinus succulentus; B, L. nanus; C, Clarkia exilis; D, C. tembloriensis; and E, 
Gilia achilleifolia. Solid circles are population means; horizontal lines represent 
observed ranges among conspecific populations. (After Schemske and Lande 1985.) 


literature, which included a disproportionate representation of selfing 
grasses and outcrossing trees; Aide 1986). Nevertheless, few hermaphrodit- 
ic plant species are “fixed” for either pure outcrossing or pure selfing, and 
conspecific populations in some species show huge variation along the self- 
ing-outcrossing continuum (Figure 5.10B). 








In hermaphroditic species in which outcrossing has been established, or 
in any dioecious species, the next genetic question is, which plants were the 
pollen donors for particular outcrossed offspring? As illustrated in Box 5.5, 
molecular genetic markers again can supply the answer. The approach 
involves comparing the diploid genotype of each seed or progeny with that 
of its known mother, and thereby deducing (by subtraction) the haploid 
genotype of the fertilizing pollen. Candidate fathers are then screened for 
diploid genotype, and paternity is excluded for those whose genotypes do 
not match the deduced pollen contribution to the progeny. Sometimes all 

BOX 5.5 Paternity Assignment 

These data (taken from a much larger allozyme data set; see Ellstrand 1984) illustrate paternity 
assignment for five progeny from a known mother. The body of the table consists of observed 
diploid genotypes at each of six loci in the wild radish, Raphanus sativus. 

                              Allozyme locus
                    LAP   PGI   PGM1  PGM2  6PGD  IDH
Known mother        1,2   1,1   1,1   1,2   1,3   1,1
Potential fathers
  A                 2,2   1,2   2,3   2,2   3,3   1,1
  B                 2,2   2,3   1,3   1,1   1,3   1,1
  C                 1,2   1,2   1,3   1,1   1,2   1,2
  D                 1,5   1,1   1,2   1,3   1,3   1,1
  E                 2,3   2,2   1,2   1,2   1,1   1,3
  F                 2,2   1,3   2,2   1,2   1,3   1,1
  G                 1,1   1,2   1,2   1,2   3,3   1,1
  H                 1,1   1,2   1,2   1,2   1,3   2,2
  I                 1,2   1,1   1,1   1,2   1,3   1,1
  J                 1,2   2,3   ?     1,2   3,3   1,2
  K                 2,2   1,2   1,3   2,2   3,3   1,1
  L                 1,2   1,1   1,1   2,3   1,3   1,1
  M                 2,5   1,1   1,2   2,3   1,2   1,3
  N                 1,1   1,1   1,2   1,1   1,1   1,1
  O                 1,3   1,2   1,2   1,2   3,3   1,1
                                                          Deduced paternity
Offspring                                                 Gamete   Assignment
  P                 2,2   1,2   1,3   1,2   1,1   1,2     223-12   C
  Q                 2,2   1,2   1,3   1,2   2,3   1,1     223-21   C
  R                 1,2   1,2   1,3   1,1   1,2   1,1     -23121   C
  S                 1,2   1,1   1,2   2,3   1,1   1,3     -12313   M
  T                 2,2   1,1   1,1   1,3   2,3   1,3     211323   M

Source: After Ellstrand 1984. 

males except the true father can be excluded. When multiple candidates 
remain, procedures exist for assigning "fractional paternity" based on sta- 
tistical probabilities of being the father. In the first large-scale application of 
these approaches, Ellstrand (1984) employed six highly polymorphic 
allozyme loci to establish paternity for 246 seeds from nine maternal plants 
in a closed population of the wild radish, Raphanus sativus. Multiple pater- 
nity was found for all assayed progeny arrays from a maternal plant and for 
at least 85% of all fruits, with the minimum paternal donor number averag- 
ing 2.3. The wild radish is a self-incompatible, insect-pollinated species. 
Subsequent work established that most multiply sired fruits were the con- 
sequence of a single insect vector having simultaneously deposited pollen 
from several plants (a phenomenon known as “pollen carryover”), and that 
a considerable fraction (up to 44%) of seed paternity for some plants 
involved immigrant pollen from sources at least 100 m away (Ellstrand and 
Marshall 1985; Marshall and Ellstrand 1985). 
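
The subtraction-and-exclusion procedure of Box 5.5 can be reproduced with a short Python sketch. The genotypes below are transcribed from the box for the mother, the five offspring, and a subset of five candidate fathers (B, C, I, L, and M); the function names are hypothetical. The fractional-paternity function is only one simple possibility, weighting each non-excluded candidate by its multilocus Mendelian likelihood and assuming equal prior weight for every candidate. It should not be read as the specific method of the cited studies.

```python
from itertools import product

LOCI = ["LAP", "PGI", "PGM1", "PGM2", "6PGD", "IDH"]
MOTHER = [(1, 2), (1, 1), (1, 1), (1, 2), (1, 3), (1, 1)]

# Subset of the Box 5.5 candidates, one (allele, allele) pair per locus.
FATHERS = {
    "B": [(2, 2), (2, 3), (1, 3), (1, 1), (1, 3), (1, 1)],
    "C": [(1, 2), (1, 2), (1, 3), (1, 1), (1, 2), (1, 2)],
    "I": [(1, 2), (1, 1), (1, 1), (1, 2), (1, 3), (1, 1)],
    "L": [(1, 2), (1, 1), (1, 1), (2, 3), (1, 3), (1, 1)],
    # M's second LAP allele does not affect any result below.
    "M": [(2, 5), (1, 1), (1, 2), (2, 3), (1, 2), (1, 3)],
}
OFFSPRING = {
    "P": [(2, 2), (1, 2), (1, 3), (1, 2), (1, 1), (1, 2)],
    "Q": [(2, 2), (1, 2), (1, 3), (1, 2), (2, 3), (1, 1)],
    "R": [(1, 2), (1, 2), (1, 3), (1, 1), (1, 2), (1, 1)],
    "S": [(1, 2), (1, 1), (1, 2), (2, 3), (1, 1), (1, 3)],
    "T": [(2, 2), (1, 1), (1, 1), (1, 3), (2, 3), (1, 3)],
}


def paternal_candidates(child, mother):
    """Alleles at one locus that the pollen grain could have contributed."""
    a, b = child
    cands = set()
    if a in mother:
        cands.add(b)  # 'a' maternal, so 'b' is paternal
    if b in mother:
        cands.add(a)  # 'b' maternal, so 'a' is paternal
    return cands


def pollen_gamete(child, mother):
    """Deduced pollen haplotype; '-' marks a locus that cannot be resolved."""
    out = []
    for c, m in zip(child, mother):
        cands = paternal_candidates(c, m)
        out.append(str(next(iter(cands))) if len(cands) == 1 else "-")
    return "".join(out)


def compatible_fathers(child, mother, fathers):
    """Candidates not excluded at any locus."""
    return [
        name for name, geno in fathers.items()
        if all(paternal_candidates(c, m) & set(f)
               for c, m, f in zip(child, mother, geno))
    ]


def mendel_likelihood(child, mother, father):
    """P(child genotype | both parents) at one locus under Mendelian segregation."""
    hits = sum(sorted((x, y)) == sorted(child) for x, y in product(mother, father))
    return hits / 4


def fractional_paternity(child, mother, fathers):
    """Paternity shares proportional to multilocus likelihood (equal priors assumed)."""
    lik = {}
    for name, geno in fathers.items():
        value = 1.0
        for c, m, f in zip(child, mother, geno):
            value *= mendel_likelihood(c, m, f)
        if value > 0:
            lik[name] = value
    total = sum(lik.values())
    return {name: value / total for name, value in lik.items()}


assignments = {k: compatible_fathers(v, MOTHER, FATHERS) for k, v in OFFSPRING.items()}
```

For this subset the sketch recovers the gametes and assignments shown in the box: offspring P, Q, and R are compatible only with father C, and S and T only with father M. Because a single candidate remains for each seed, the fractional shares are trivially 1.0; the likelihood weighting becomes informative only when two or more candidates survive exclusion.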

In another classic allozyme study, in this case of a small forest herb, 
Chamaelirium luteum, Meagher (1986) established paternity likelihoods for 
575 seeds with known mothers. The distribution of inter-mate (pollen-flow) 
distances indicated that more nearby fertilizations had taken place than 
expected under random mating, but nonetheless some mating pairs were 
separated from one another by more than 30 m. A follow-up study of estab- 
lished seedlings (whose maternity was unknown) confirmed this pollen dis- 
persal profile and also demonstrated that pollen and seed dispersal dis- 
tances were similar (Meagher and Thompson 1987). Surprisingly, no rela- 
tionship was found in this species between the size of the male plant (seem- 
ingly indicative of reproductive effort) and paternity success (Meagher 
1991). Hamrick and Murawski (1990) conducted similar genetic paternity 
analyses on several tropical woody species and showed that a significant 
proportion of the pollen received by individuals came from relatively few 
pollen donors; many matings (30%-50%) appeared to take place between 
nearest neighbors, and about 10%-25% of matings involved long-distance 
pollen flow (greater than 1 km). Thus, the overall breeding structure 
appeared to have two components: a leptokurtic (i.e., peaked) pattern of 
pollen dispersal within populations, superimposed on a more even distri- 
bution of “background” pollen originating from outside the population. 

Paternity analyses in plants are often referred to as providing direct esti- 
mates of gene flow (albeit across a single generation), as opposed to the indi- 
rect estimates of historical plus contemporary gene flow that can come from 
estimates of population genetic structure (see Chapter 6). In a common exper- 
imental setting, progeny within a focal population are monitored for paternal 
alleles that by genetic exclusion must have come from outside (rather than 
inside) the plot. Several such direct appraisals of paternal gene flow have 
documented instances (often in high frequencies) of immigrant pollen having 
arrived from rather distant sources. For example, in three species of fig trees 
(Ficus), molecular markers indicated that more than 90% of the pollen came 
from at least 1,000 meters away (Nason and Hamrick 1997). Other insect-pollinated 
trees in which molecular paternity analyses have often revealed long-distance 
pollen flow include Calophyllum longifolium (Stacy et al. 1996), 
Pithecellobium elegans (Chase et al. 1996), Swietenia humilis (White et al. 2002), 
and Tachigali versicolor (Loveless et al. 1998). Two wind-pollinated species for 
which frequent long-distance pollen flow has likewise been documented by 
molecular paternity analyses are Quercus macrocarpa (Dow and Ashley 1998) 
and Pinus flexilis (Schuster and Mitton 2000). 
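
The exclusion logic behind such direct appraisals can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure of any study cited above: the genotypes, the function names (`paternal_alleles`, `is_immigrant`), and the toy data are all hypothetical, maternity is assumed to be known, and markers are assumed to be codominant. A seed is scored as sired by outside pollen only when every male inside the plot is excluded at one or more loci. Because immigrant pollen may carry alleles that also occur inside the plot, the resulting fraction is best read as a minimum estimate.

```python
def paternal_alleles(offspring, mother):
    """Alleles the father could have contributed at one locus.

    Genotypes are 2-tuples of allele labels; the mother is assumed known.
    """
    a, b = offspring
    candidates = set()
    # If the mother could have supplied `a`, the father supplied `b`, and vice versa.
    if a in mother:
        candidates.add(b)
    if b in mother:
        candidates.add(a)
    return candidates


def is_immigrant(offspring, mother, local_males):
    """True if every male inside the plot is excluded at one or more loci."""
    for male in local_males:
        compatible = all(
            paternal_alleles(o, m) & set(f)
            for o, m, f in zip(offspring, mother, male)
        )
        if compatible:
            return False  # this local male cannot be excluded
    return True


# Hypothetical two-locus data (illustration only).
mother = [("A", "B"), ("C", "C")]
local_males = [
    [("A", "A"), ("C", "D")],
    [("B", "B"), ("C", "C")],
]
seeds = [
    [("A", "B"), ("C", "D")],  # compatible with the first local male
    [("A", "E"), ("C", "C")],  # allele E occurs in no local male
    [("B", "B"), ("C", "C")],  # compatible with the second local male
]

flags = [is_immigrant(s, mother, local_males) for s in seeds]
min_immigrant_fraction = sum(flags) / len(flags)
print(flags, round(min_immigrant_fraction, 2))
```

With more loci and more alleles per locus, fewer local males escape exclusion by chance, which is why highly polymorphic markers sharpen such estimates.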

In native species of conservation concern, as well as in crop plants, 
genetic determinations of pollen sources often carry management or eco- 
nomic ramifications. For example, in a tropical tree of conservation interest, 
Symphonia globulifera, genetic paternity analyses demonstrated that a few 
large pasture specimens contributed disproportionately to the population's 
overall gene pool, thus producing a relative genetic bottleneck that would 
otherwise not have been apparent (Aldrich and Hamrick 1998). A more 
applied example involves commercial pine-seed orchards that provide a 
significant fraction of the zygotes used to establish tree plantations in the 
southeastern United States. One such seed orchard for the loblolly pine 
(Pinus taeda) in South Carolina was composed of grafted ramets of 50 loblol- 
ly clones that had been chosen and maintained for phenotypically desirable 
traits. Using allozyme markers, Friedman and Adams (1985) discovered that 
at least 36% of seeds from this orchard were fertilized by outside pollen, 
despite a surrounding 100-meter-wide buffer zone positioned explicitly to 
prevent such genetic contamination by non-selected males. Similarly, in a 
population of cultivated squash (Cucurbita pepo), Kirkpatrick and 
Wilson (1988) showed by molecular markers that approximately 5% of prog- 
eny were fathered by native wild gourds (C. texana), an outcome illustrative 
of the potentials for appreciable genetic exchange that are known to exist 
between many cultivated crops and their wild relatives (Ellstrand 2003). 


Selected empirical examples by topic 


CONCURRENT MULTIPLE PATERNITY. Molecular assays of individual litters, 
broods, and clutches have demonstrated concurrent multiple paternity for a 
wide variety of species in nature. Apart from the numerous birds and the 
nest-tending fishes mentioned above, these include many species of mam- 
mals (e.g., Birdsall and Nash 1973; Gomendio et al. 1998; Hoogland and 
Foltz 1982; Taggart et al. 1998; Tegelström et al. 1991; Xia and Millar 1991), 
snakes and lizards (Garner et al. 2002; Gibbs and Weatherhead 2001; Gibson 
and Falls 1975; Olsson and Madsen 2001), alligators (Davis et al. 2001), 
aquatic and terrestrial turtles (Bollmer et al. 1999; Palmer et al. 1998; Pearse 
and Avise 2001; Pearse et al. 2002), amphibians (Halliday 1998; Tennessen 
and Zamudio 2003; Tilley and Hausman 1976), female-pregnant fishes 
(Chesser et al. 1984; Soucy and Travis 2003; Travis et al. 1990; Trexler et al. 
1997; Zane et al. 1999), ascidians (Bishop et al. 2000), mollusks (Avise et al. 
2004; Baur 1998; Emery et al. 2001; Gaffney and McGee 1992; Mulvey and 
Vrijenhoek 1981; Oppliger et al. 2003), platyhelminth flatworms (Pongratz 
and Michiels 2003), and diverse arthropods and related groups (Baragona 
and Haig-Ladewig 2000; Brockman et al. 2000; Curach and Sunnucks 1999; 
Heath et al. 1990; Martyniuk and Jaenike 1982; Milkman and Zeitler 1974; 
Nelson and Hedgecock 1977; Parker 1970; Sassaman 1978; Walker et al. 2002; 
see also below). A fascinating natural history tour through this promiscuous 
biological world is provided by Birkhead (2000). 

A few of these studies involved socially monogamous species and thus 
their results were somewhat surprising. Many others involved socially 
polygynous species and thus confirmed suspicions that multiple copula- 
tions or inseminations could indeed result in multiple successful fertiliza- 
tions of a progeny cohort. For example, female Belding's ground squirrels 
(Spermophilus beldingi) are known to mate with several different males. 
Allozyme data established that an estimated 78% of litters were multiply 
sired, usually by two or three males (Hanken and Sherman 1981). In a sim- 
ilar study of an insect, the willow leaf beetle (Plagiodera versicolora), more 
than 50% of wild-caught females produced egg clutches with multiple sires 
(McCauley and O'Donnell 1984). On the other hand, not all molecular 
genetic analyses have uncovered evidence for multiple paternity within 
clutches. For example, using allozyme assays, Foltz (1981) demonstrated a 
high degree of genetic monogamy in the old-field mouse (Peromyscus 
polionotus) as did Ribble (1991) in DNA fingerprinting assays of the 
California mouse (P. californicus). 


ALTERNATIVE REPRODUCTIVE TACTICS. Alternative reproductive tactics 
(ARTs) are different behavioral modes employed by conspecific males (or 
females; Henson and Warner 1997) to achieve successful reproduction 
(Gross 1996; Taborsky 2001). They may be hard-wired genetic polymor- 
phisms, or they may reflect behavioral or other phenotypic switches related 
to environmental conditions (e.g., hormone levels during development), but 
in either case they co-occur as distinctive reproductive strategies within a 
population or species. Examples were introduced above in discussions of 
flanged versus unflanged orangutans and bourgeois, sneaker, and satellite 
males in bluegill sunfish. Another example involves salmon, males of which 
may spawn either as full-sized anadromous adults after returning from the 
sea or as dwarf precocious parr that have remained in fresh water. Parentage 
analyses based on molecular markers have provided unprecedented infor- 
mation on individuals’ relative reproductive success in populations dis- 
playing ARTs. For example, analyses of several populations of Atlantic 
salmon (Salmo salar) have shown that parr fertilize widely varying proportions 
(5%-90%) of eggs at different spawning sites (Garant et al. 2001; 
Garcia-Vasquez et al. 2001; Jordan and Youngson 1992; Moran et al. 1996; 
Thomaz et al. 1997). 

In the spotted hyena (Crocuta crocuta), genetic parentage resulting from 
alternative male and female reproductive tactics was evaluated by 
microsatellite profiling of 236 offspring in 171 litters from three clans (East 
et al. 2002). Despite polyandrous mating and high frequencies of multiple 
paternity (35% of litters), female choice and sperm competition (see below) 
appeared to counter or trump pre-copulatory male tactics. This was evi- 
denced by the finding that male hyenas rarely sired offspring with females 
whom they attempted to manipulate through monopolization or harass- 
ment, whereas males who invested energy and time in fostering amicable 
relationships with females proved to have sired most of the offspring. 

Another example of parentage dissected by genetic markers involves 
side-blotched lizards (Uta stansburiana). Males in this species have three 
ARTs, each of which trumps, but is also susceptible to, one other tactic, 
much as in the children’s game of rock-paper-scissors. One form of male 
has a blue throat, is territorial, and guards its mate. Another form has an 
orange throat and is hyper-territorial and polygynous, avidly mating with 
multiple females. A third form is yellow-throated and does not regularly 
defend a territory, but instead gains access to territories of defender males 
by mimicking a female and then sneaking copulations with resident 
females. Genetic parentage analyses coupled with field observations 
(Sinervo and Clobert 2003; Sinervo and Lively 1996; Zamudio and Sinervo 
2000) have shown that the mate-guarding strategy of blue-throated males 
usually enables them to avoid cuckoldry by yellow-throated males, but 
leaves them vulnerable to cuckoldry by more aggressive orange-throated 
males. However, by virtue of their hyper-aggressive behavior, orange- 
throated males often obtain territories so large that they are unable to 
defend their females against yellow-throated sneaker males. So, the repro- 
ductive tactic of yellow-throated males (rock) can smash that of orange- 
throated males (scissors), which can snip that of blue-throated males 
(paper), which can cover that of yellow-throated males. 


SPERM STORAGE. Following a copulation event, the reproductive tract of 
females in many species is physiologically capable of storing viable sperm 
for varying periods of time (Birkhead and Møller 1993a; Howarth 1974; 
Smith 1984): typically a few days in mammals, weeks in many insects and 
birds, months in some salamanders, and up to several years in some snakes 
and turtles. Traditional evidence for this conclusion came from direct obser- 
vations of live sperm (typically in special female storage organs referred to 
as spermathecae) and from the fact that captive females isolated from males 
for some period of time may continue to produce offspring (although the 
possibility that these progeny are parthenogenetic is not eliminated by this 
observation alone). 

In recent years, molecular-based parentage analyses have added to our 
understanding of sperm storage phenomena. One illustration of the 
approach, involving a natural population of painted turtles (Chrysemys 
picta), may also provide the current record for the longest period of female 
sperm storage genetically verified in any species. Pearse et al. (2001a) used 
microsatellite markers to deduce paternity in successive clutches of physi- 
cally tagged females. Exclusion probabilities were sufficiently high that 
unique-sire genotypes could be identified, and these genotypes sometimes 
were evidenced in the offspring of clutches that particular females had laid 
across as many as 3 successive years. By hard criteria, the possibility that a 
female had re-mated each year with the same male could not be eliminated 
entirely, but this explanation was deemed highly implausible given the aso- 
cial nature of this species, its high dispersal capability, and the high local 
densities of males. Thus, almost certainly, long-term female sperm storage 
and utilization had been documented by the genetic evidence. 


SPERM AND POLLEN COMPETITION. The widespread occurrences of genetic 
polygyny, concurrent multiple paternity, ARTs, and extended female sperm 
storage in many species all indicate that sperm from two or more males are 
often placed in direct competition for fertilization of eggs within a female's 
reproductive cycle. Several morphological characteristics and reproductive 
behaviors of males have been interpreted as adaptations to meet the genet- 
ic challenges resulting from this supposed competition with another male's 
sperm (Parker 1970). For example, in many worms, insects, spiders, snakes, 
and mammals, a male secretes a plug that serves temporarily as a "chastity 
belt" to block a female's reproductive tract from subsequent inseminations. 
In many damselflies and dragonflies (Odonata), males have a recurved 
penis that physically scoops out old sperm (from other males) from a 
female's reproductive tract during mating, thus helping to account for the 
genetic observation that last-mating males tend to sire disproportionate 
numbers of progeny (C. G. Cooper et al. 1996; Hooper and Siva-Jothy 1996). 
Other widespread male behaviors that have been interpreted as providing 
paternity assurance in the face of potential sperm competition include pro- 
longed copulation (up to a week in some butterflies), multiple copulations 
with the same female, and post-copulatory mate guarding (Parker 1984). 

From a female's perspective, mechanisms to prevent competition 
among sperm from different males are not necessarily desirable, which can 
lead to intersexual reproductive conflicts of interest (Eberhard 1998; 
Knowlton and Greenwell 1984). Growing evidence also suggests that the 
reproductive tracts of females may often play a more active role than previously 
supposed in post-copulatory choice of fertilizing sperm (Birkhead and 
Møller 1993b; Mack et al. 2003). These and related topics have made "sperm 
competition" one of the hottest topics in molecular ecology and evolution 
over the last two decades (Baker and Bellis 1995; Birkhead and Møller 1992, 
1998; Smith 1984). 

In individual clutches of multiply inseminated females, molecular 
markers can be employed to determine which among the competing males' 
sperm have achieved the fertilizations. Is the first-mating male at a repro- 
ductive advantage, or does the last-mating male achieve the highest fertil- 
ization success? Or is there no mating-order effect, the probability of fertil- 
ization instead merely being proportional to the number of sperm deposit- 
ed by each male (the "raffle" scenario)? These questions have been 
addressed using genetic markers for numerous animal species (see Birkhead 
and Møller 1992; Smith 1984 for pioneering reviews). In insects, it often 
proved to be the case (but not always; Laidlaw and Page 1984) that the last 
male to mate with a female sired most of the offspring (Figure 5.11). For 
example, in the bushcricket Poecilimon veluchianus, Achmann et al. (1992) 
showed by DNA fingerprinting that the last male to mate achieved more 
than 90% of the fertilizations. Mating in this species involves transfer of a 
large spermatophore to a female, who often copulates with several males 
and may eat some of the spermatophores after copulation. The genetic find- 
ings appeared to eliminate the possibility that nourishment gained by a 
female from the spermatophore "gift" of an early-mating male reflected a 
paternal investment strategy enhancing that male's fitness. 

Figure 5.11 Outcomes of sperm competition in insects, typically as determined 
by genetic paternity analyses based on molecular markers. Shown is the frequency 
distribution (across more than 100 species) of the proportion of eggs fertilized by 
the second of two males to have mated sequentially with doubly inseminated 
females. (After Simmons and Siva-Jothy 1998.) 
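
The contrast between a mating-order effect and the "raffle" scenario can be made concrete with a short Python sketch. It is an illustration only: the function names and the numbers are hypothetical, and the fair-raffle expectation assumes nothing more than that every sperm is equally likely to achieve a fertilization. The observed quantity is the proportion of genotyped offspring assigned to the second of two males, which is the variable summarized across species in Figure 5.11 and is often called P2 in the sperm competition literature.

```python
def p2_observed(sired_by_first, sired_by_second):
    """Proportion of genotyped offspring assigned to the second male (P2)."""
    total = sired_by_first + sired_by_second
    if total == 0:
        raise ValueError("no offspring were assigned to either male")
    return sired_by_second / total


def p2_fair_raffle(sperm_first, sperm_second):
    """Expected P2 if fertilization is proportional to sperm numbers alone."""
    return sperm_second / (sperm_first + sperm_second)


# Hypothetical clutch: 4 offspring assigned to the first male, 36 to the second.
observed = p2_observed(4, 36)
# Hypothetical equal ejaculate sizes, so a fair raffle predicts an even split.
expected = p2_fair_raffle(1000, 1000)
print(f"observed P2 = {observed:.2f}, fair-raffle P2 = {expected:.2f}")
```

An observed P2 far above the raffle expectation, as in this made-up clutch, is the pattern that would point toward last-male advantage rather than simple numerical competition.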

The term "sperm displacement" conventionally was employed to 
describe the enhanced reproductive success exhibited by last-mating males. 
In an insect, the locust Locusta migratoria, an active "sperm flushing" process 
has been observed that probably contributes to the phenomenon (Parker 
1984). In the dunnock sparrow, males peck at the cloaca of a female before 
copulating with her, apparently causing her to eject sperm from previous 
matings (Birkhead and Møller 1992). In other cases, mechanisms of sperm 
displacement appear less active. In chickens and ducks, for example, semen 
from different inseminations is stored in separate layers in the female repro- 
ductive tract, with the most recent contribution remaining on top and there- 
fore perhaps most likely to fertilize the next available egg (McKinney et al. 
1984). For such instances, more neutral terms, such as "sperm predomi- 
nance" (Gromko et al. 1984) or "sperm precedence," may be preferred. In 
some birds, both raffle competition and sperm precedence are known to 
operate, but over different time scales. If inseminations occur more than 
about 4 hours apart, then last-male sperm precedence tends to operate, but 
a sperm raffle characterizes the process when two males inseminate a 
female in rapid succession (Birkhead and Møller 1992). 

In a small proportion of insect species (see Figure 5.11) and in various 
other animals, first-mating males appear to have the fertilization advantage. 
For example, in the intertidal copepod Tigriopus californicus, allozyme stud- 
ies showed that virtually all of a female's progeny are fathered by her first 
mate (Burton 1985). In this species, a male often clasps a female for a period 
of several days before her sexual maturation. In light of the genetic obser- 
vations, Burton interpreted this prolonged clasping behavior by males as a 
pre-copulatory mate-guarding strategy to ensure that a potential mate has 
not been inseminated previously. 

In the relatively asocial ground squirrel Spermophilus tridecemlineatus, 
synchronously breeding females are scattered spatially at low densities. As 
a consequence of this natural history, the mating system probably conforms 
to what has been labeled "scramble-competition polygyny." Indeed, behav- 
ioral observations suggest that the strongest phenotypic correlate of male 
mating success is male mobility during the breeding season, presumably 
because traveling males are more likely to encounter females in estrus 
(Schwagmeyer 1988). Using allozyme markers, Foltz and Schwagmeyer 
(1989) discovered that in wild populations of this species, the first male to 
copulate with a multiply mated female sired on average about 75% of the 
resulting progeny. These results were interpreted to indicate that a mating 
advantage for first males during pre-copulatory scramble competition 
translates into a genetic advantage during the ensuing post-copulatory 
sperm competition. 

A remarkable example of first-male fertilization advantage was report- 
ed for the spotted sandpiper (Actitis macularia). In this polyandrous avian 
species with strong tendencies toward behavioral sex role reversal (includ- 
ing nest-tending by males), territorial females pair with, defend, and lay 
clutches for several males. Molecular studies based on DNA fingerprinting 
showed that males pairing early in the mating season cuckold their females' 
later mates by means of sperm storage in the females’ reproductive tracts 
(Oring et al. 1992). Thus, not only does an early-pairing male have a greater 
confidence of paternity, but he thereby also appropriates the reproductive 
efforts of subsequent males toward enhancement of his own fitness. 

The intriguing idea of sperm sharing was advanced for some species of 
hermaphroditic freshwater snails (Monteiro et al. 1984). According to this 
suggestion, a snail might pass on sperm from a previous mate to another 
partner, such that the transmitting individual acts mechanically as a male 
but achieves no genetic contribution to progeny. However, an empirical test 
of this hypothesis based on allozyme markers failed to support the sperm- 
sharing hypothesis (Rollinson et al. 1989). Instead, hermaphroditic snails 
proved capable of passing on their own sperm while still producing eggs 
fertilized by sperm received from an earlier mating. A variety of other issues 
regarding sperm competition in hermaphroditic species are reviewed by 
Michiels (1998). 

In plants, opportunities also exist for competition among male gametes 
from different donors, as, for example, via differing rates of pollen tube 
growth through stigmatic tissue toward the egg (Snow 1990). Thus, pollen 
competition in plants is the analogue of sperm competition in animals 
(Delph and Havens 1998). Using allozyme markers to establish paternity, 
Marshall and Ellstrand (1985) demonstrated that most of the seeds in multi- 
ply sired fruits of the wild radish (Raphanus sativus) resulted from the first 
in a series of sequential pollen donors. Further study revealed that gameto- 
phyte competition among several pollen donors was more pronounced than 
that among male gametophytes from a single pollen source (Marshall and 
Ellstrand 1986). In the morning glory Ipomoea purpurea, similar allozyme 
analyses also revealed a strong fertilization advantage for first-pollinating 
males, even when pollen donations from a second source occurred immedi- 
ately after the first (Epperson and Clegg 1987). In paternity studies of the 
herbaceous plant Hibiscus moscheutos, allozyme markers revealed that individuals 
with fast-growing pollen tubes sired a disproportionate number of 
seeds following mixed experimental pollinations (Snow and Spira 1991). 
More examples of pollen competition, in the context of barriers to interspe- 
cific hybridization, will be provided in Chapter 7. 


MATERNITY ANALYSIS. In many taxonomic groups, such as mammals, 
maternity is usually more evident than paternity from direct behavioral 
observations, but in some cases the biological mother of particular offspring 
nonetheless remains in doubt. Tamarin et al. (1983) used an ingenious 
method for maternity assignment in small mammals. They injected preg- 
nant or lactating females with unique combinations of gamma-emitting 
radionuclides (e.g., isotopes of Co, Sr, and Zn), which were transferred to progeny via 
placenta or mother's milk. The isotopic profiles of young were determined 
spectrophotometrically and matched against those of prospective mothers 
to establish maternity (assuming that mothers nurse only their own off- 
spring). Sheridan and Tamarin (1986) combined this method of maternity 
assignment with protein electrophoretic analyses to assess parentage in 40 
offspring from a natural population of meadow voles (Microtus pennsylvan- 
icus). Knowledge of maternity facilitated paternity analyses and led to the 
conclusion that about 38% of the adult males in the population bred suc- 
cessfully in the surveyed time period, fathering at most two litters each. 

Each spring, pregnant females of the Mexican free-tailed bat (Tadarida 
brasiliensis) migrate to caves in the American Southwest and form colonies 
often containing several million individuals. Most females produce single 
pups, which within hours of birth are deposited on the cave ceilings or walls 
in dense creches. Lactating females return to the creches and nurse pups twice 
each day. Traditional thought was that nursing must be indiscriminate, such 
that mothers act "as one large dairy herd delivering milk passively to the first 
aggressive customers" (Davis et al. 1962), but McCracken (1984) challenged 
this view with protein electrophoretic evidence indicating that nursing was 
selective along genetic lines. This conclusion stemmed from comparisons of 
observed allozyme genotypes in female-pup nursing pairs with the expected 
frequencies of such genotypic combinations if nursing were random. A high- 
ly significant deficit of maternal genetic exclusions (relative to expectations 
from population genotype frequencies) indicated selective nursing by females 
of their own pups (or at least those of close kin). McCracken estimated that 
only 17% of the assayed females were nursing pups that could not be their off- 
spring. A DNA fingerprinting analysis of maternity roosts in another bat 
species, Myotis lucifugus, likewise led to the conclusion that females preferen- 
tially suckle their own young (Watt and Fenton 1995). 
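
The comparison underlying this kind of inference, observed maternal exclusions versus those expected if nursing were random, can be sketched as follows. This is a simplified single-locus illustration and not a reproduction of McCracken's statistical procedure; the function names and the toy genotypes are hypothetical. With a codominant marker, a female is excluded as the mother of a pup whenever the two share no allele. The random-nursing expectation used here is simply the fraction of all possible female-pup pairings in the sample that would show an exclusion.

```python
from collections import Counter
from itertools import product


def shares_allele(g1, g2):
    """A female can be the mother only if she shares an allele with the pup."""
    return bool(set(g1) & set(g2))


def expected_exclusion_rate(female_genotypes, pup_genotypes):
    """Fraction of excluded pairs if females and pups were paired at random."""
    f_counts = Counter(female_genotypes)
    p_counts = Counter(pup_genotypes)
    n_pairs = len(female_genotypes) * len(pup_genotypes)
    excluded = sum(
        fc * pc
        for (fg, fc), (pg, pc) in product(f_counts.items(), p_counts.items())
        if not shares_allele(fg, pg)
    )
    return excluded / n_pairs


def observed_exclusion_rate(nursing_pairs):
    """Fraction of observed (female, pup) nursing pairs showing an exclusion."""
    excluded = sum(not shares_allele(f, p) for f, p in nursing_pairs)
    return excluded / len(nursing_pairs)


# Hypothetical nursing pairs at one locus: (female genotype, pup genotype).
pairs = [
    (("A", "A"), ("A", "B")),
    (("A", "B"), ("B", "B")),
    (("B", "B"), ("A", "B")),
    (("A", "A"), ("A", "A")),
]
females = [f for f, _ in pairs]
pups = [p for _, p in pairs]

obs = observed_exclusion_rate(pairs)
exp = expected_exclusion_rate(females, pups)
print(f"observed = {obs:.4f}, expected under random nursing = {exp:.4f}")
```

A deficit of observed exclusions relative to the random expectation, assessed over many pairs and loci with an appropriate significance test, is the signature of selective nursing described above.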

VandeBerg et al. (1990) employed protein electrophoretic markers to 
validate pedigrees in captive squirrel monkeys (genus Saimiri). Among 89 
progeny for which parentage had been inferred from behavioral observa- 
tions, assignments for seven individuals proved incorrect, and retrospective 
examination of colony records in conjunction with further genetic typing 
permitted a correction of pedigree records. Five of the errors had involved 
cases of mistaken paternity, but two involved mistaken maternity. These lat- 
ter cases apparently were the consequence of infant swapping between 
dams shortly after birth, an "allomaternal" behavior that previously had 
gone unrecognized. 

Far more commonly, questions about maternity arise in oviparous ani- 
mals such as birds, fishes, and insects, in which prolonged care of eggs out- 
side the female's body opens possibilities for intraspecific brood parasitism 
or other means of egg or progeny mixing. Indeed, as described above, pater- 
nity in fishes is normally more field-evident than maternity (due to the 
prevalence of male parental care), so genetic maternity is typically one of the 
prime foci of molecular parentage analyses. In birds, traditional methods for 
inferring IBP include monitoring nests for supernormal clutch sizes, notic- 
ing the appearance of eggs deposited outside the normal laying sequence of 
the resident female, or detecting intra-clutch differences in the physical 
appearance of eggs in those species in which inter-clutch differences in egg 
patterning are pronounced. Molecular approaches provide more direct 
maternity assessments. For example, in wild zebra finches, a DNA finger- 
printing analysis of 92 offspring from 25 families revealed that about 11% of 
offspring and 36% of broods resulted from IBP, and that the mean number 
of parasitic eggs per clutch was greater than one (Birkhead et al. 1990). In 
house wrens (Troglodytes aedon), a similar study based on allozymes led to 
the conclusion that about 30% of chicks were produced by females other 
than the nest attendant (Price et al. 1989). 

Genetic markers have also been used to address issues concerning inter- 
specific brood parasitism, a phenomenon in which females of one species 
surreptitiously lay their eggs in nests of other species. At molecular issue in 
this case is not how often this behavior occurs in nature (this is often evident 
from direct field observations, because eggs and young of the species 
involved are usually visually distinguishable), but rather how often brood 
parasitic behaviors have arisen in evolution. By mapping the phenomenon 
of brood parasitism (as opposed to personal nesting) onto an mtDNA-based 
molecular phylogeny for 15 species of cuckoos, Aragon et al. (1999) con- 
cluded that the phenomenon had a polyphyletic origin in the order 
Cuculiformes, having arisen separately in at least three well-defined clades. 
In one European species of interspecific brood parasite, the common cuckoo 
(Cuculus canorus), similar genetic analyses further showed that different 
gentes (“races” with different egg-color patterns) represent distinctive 
matrilines that nonetheless are closely similar to one another in their overall 
genetic makeup (Gibbs et al. 2000a; Marchetti et al. 1998). 


POPULATION SIZE. Genetic parentage analyses are also informative in 
terms of estimating local population size in at least two ways. First, by pin- 
pointing which adults have actually sired and dammed progeny, molecular 
markers can offer better assessments of variance in reproductive success 
and of effective population size (see Box 2.2) than can mere census counts of 
potentially breeding adults (e.g., Hoelzel et al. 1999). Such knowledge can 
be important, for example, in assessing the magnitude of inbreeding in 
small captive or managed populations (Pope 1996). 

A second means by which molecular parentage analyses can provide 
information about population numbers was introduced by Jones and Avise 
(1997b). In wildlife biology, a traditional approach is to use physical traps in 
mark-recapture protocols to estimate the contemporary size of a deme 
(Seber 1982). Under the oft-used Lincoln-Peterson statistic, for example, the 
number of individuals in a population is estimated as: 


\[
\frac{(n_1 + 1)(n_2 + 1)}{(m_2 + 1)} - 1
\]


where $n_1$ is the number of animals captured and physically marked in an 
initial sample, $n_2$ is the number of animals caught later, and $m_2$ is the 
number of recaptured (marked) animals in the second sample (Pollock et al. 
1990). The parentage analysis approach is a genetic analogue of this traditional 
method, in which the initial “marks” are, for example, the deduced 
genotypes of males ($n_1$) who sired progeny in assayed clutches. Genotypes 
of adult males from the population can then be considered the second sample 
($n_2$), and those males that perfectly match deduced paternal genotypes 
in the clutches are considered “recaptures” ($m_2$). By plugging these genetically 
deduced parameter values into the Lincoln-Peterson equation, the current 
size of the adult breeding population can be estimated. 
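
As a worked illustration of the genetic mark-recapture idea, the short Python sketch below evaluates the bias-corrected form of the statistic, (n1 + 1)(n2 + 1)/(m2 + 1) - 1, as it is usually written. The function name and the counts are hypothetical and are not taken from any study cited here.

```python
def lincoln_petersen(n1, n2, m2):
    """Bias-corrected estimate: (n1 + 1)(n2 + 1) / (m2 + 1) - 1.

    n1: genetic "marks" (e.g., distinct sire genotypes deduced from clutches)
    n2: adults genotyped in the second sample
    m2: second-sample adults that match a deduced genotype ("recaptures")
    """
    if min(n1, n2, m2) < 0:
        raise ValueError("counts must be non-negative")
    if m2 > min(n1, n2):
        raise ValueError("recaptures cannot exceed either sample size")
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1


# Hypothetical counts: 20 deduced sire genotypes, 30 sampled adult males,
# 5 of which perfectly match a deduced sire genotype.
estimate = lincoln_petersen(20, 30, 5)
print(f"estimated number of breeding males = {estimate:.1f}")
```

The fewer the matches relative to the two sample sizes, the larger the estimated breeding population, just as in a conventional trapping study.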

Pearse et al. (2001b) explored several variations on this theme. For exam- 
ple, in a population that is monitored over multiple breeding seasons, both 
marks and recaptures could come from the genetically deduced paternal (or 
maternal) genotypes in successive clutches. This method also has the advan- 
tage that there is never a need to physically trap (or even observe) the alternate 
sex, because polymorphic genes provide the marks and breeding individuals 
of one sex in effect provide both the captures and the recaptures of the opposite 
gender (via mating). Also, the resulting estimate of n for a given population 
refers explicitly to successful breeders (as opposed to all individuals), and 
thus may be of special interest in many ecological circumstances. 


SUMMARY 


1. Qualitative molecular markers from highly polymorphic loci provide powerful 
tools for assessing genetic identity versus non-identity and biological parent- 
age (maternity and paternity). 

2. In human forensics, DNA fingerprinting, first by VNTR loci and now by STR 
loci, provided a late-twentieth-century analogue of traditional fingerprinting. 
DNA fingerprinting has found wide application in civil litigation and criminal 
cases. Conservative procedures for calculating probabilities of a genotypic 
match can serve to ameliorate any potential biases against the defense. 


3. For plants and animals that are known or suspected to reproduce asexually 
(clonally) as well as sexually, several types of polymorphic molecular markers 
have been used to assess reproductive mode in particular populations, to 
describe spatial distributions of particular genets (clonal descendants from a 
single zygote), and to estimate evolutionary ages of clonal lineages. Some 
clones have proved to be unexpectedly ancient, but these are the exception, 
not the rule. 


4. In many microorganisms, including various bacteria, fungi, and protozoans, 
molecular markers have revealed unexpectedly strong proclivities for clonal 
reproduction in addition to mechanisms for occasional recombinational 
exchange of genetic material. These findings are of medical as well as academ- 
ic interest because they can influence strategies for diagnosis of disease agents 
and for development of vaccines and curative drugs. 


5. Molecular markers have found application in identifying genetic chimeras in 
nature, as well as in ascertaining gender in dioecious species, in which these 
features are not necessarily obvious from an inspection of external phenotypes 
alone. 


6. Molecular assessments of genetic parentage can identify an individual's sire 
and dam (or at least exclude most candidate parents) when maternity or pater- 
nity are uncertain from other evidence. Methods of empirical analysis are 
influenced by the nature of the particular parentage problem, the size of the 
pool of candidate parents, and numbers of offspring in a clutch. 


7. Individual clutches or broods in many vertebrate and invertebrate species have 
often proved upon molecular analysis to include varying proportions of foster 
young resulting from extra-pair fertilization (EPF) or sometimes intraspecific 
brood parasitism (IBP). Through such analyses, the distinction between social 
mating systems and genetic mating systems has become widely appreciated. 


8. Topics that have been informed through genetic parentage analyses include 
alternative reproductive tactics, sperm storage, sperm (and pollen) competi- 
tion, estimation of effective and census population size, and sociobiological 
patterns. In general, a powerful approach is to combine genetic parentage data 
with behavioral or other independent observations and interpret outcomes in 
the context of relevant ecological and evolutionary theory on mating systems, 
sexual selection, sexual dimorphism, and behavior. 
Kinship and Intraspecific
Genealogy 





Community of descent is the hidden bond which naturalists have been 
unconsciously seeking. 
C. Darwin (1859) 


Clonal identity and parentage, the subject of Chapter 5, are extreme examples of 
close kinship. In this chapter we shall be concerned with applying molecular 
markers to reveal genetic relatedness within and among broader groups of 
extended intraspecific kin. Questions of genetic relatedness arise in virtually all 
discussions of social species in which particular morphologies and behaviors 
might have evolved as predicted under theories of inclusive fitness and kin selec- 
tion (Box 6.1). Interest in kinship also arises for any species whose populations are 
spatially structured, perhaps along family lines. At increasingly greater depths in 
time, all conspecific individuals are related to one another through an extended 
pedigree that constitutes the composite intraspecific genealogy of a species. 


Close Kinship and Family Structure 


Molecular assessments of close kinship require qualitative genetic markers with 
known transmission properties, such as allozymes or microsatellites. However, 
compared with the rather straightforward situation in paternity and maternity 
analysis (in which genetic pathways connecting individuals extend across only 
one generation), appraisals of extended kinship are complicated by the fact that 
multiple generations and potential transmission pathways link more distant 
relatives. Thus, even when fairly large numbers of loci are assayed, the focus in 
BOX 6.1 Within-Group Genetic Relatedness, Inclusive
Fitness, and Kin Selection 


Genetic Relatedness 

Discussions of close kinship often require a quantitative measure of genetic
relatedness (r). An intuitive interpretation of a coefficient of relatedness is provided
by an answer to the following question: What is the probability that an allele car- 
ried by the focal individual is also possessed by the relative in question? In other 
words, what is the expected proportion of alleles shared by these individuals’ 
genomes? In principle, r = 0.50 for full siblings and parent-offspring pairs (see 
Figure 6.1A); r = 0.25 for half-sibs or for an individual and its uncles, aunts, 
grandparents, and grandchildren; r = 0.125 for first cousins; and r = 0.0 for non- 
relatives. Generally, for any known pedigree, true values of r can be determined 
by direct pathway analysis of gene transmission routes (Cannings and 
Thompson 1981; Michod and Anderson 1979). 
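
The pathway analysis mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of the cited authors: the pedigree and all individual names are hypothetical, and r is taken as twice the coefficient of kinship, which assumes that neither individual is itself inbred.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother).
# Founders have unknown parents (None, None) and are assumed unrelated.
PEDIGREE = {
    "GF": (None, None), "GM": (None, None),   # grandparents
    "X1": (None, None), "X2": (None, None),   # unrelated mates
    "X3": (None, None),
    "A": ("GF", "GM"), "B": ("GF", "GM"),     # full sibs
    "H": ("GF", "X3"),                        # half-sib of A and B
    "C1": ("A", "X1"),                        # offspring of A
    "C2": ("X2", "B"),                        # offspring of B
}

@lru_cache(maxsize=None)
def depth(ind):
    """Number of generations separating ind from its most distant founder."""
    if ind is None:
        return -1
    father, mother = PEDIGREE[ind]
    return 1 + max(depth(father), depth(mother))

@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that an allele drawn at random from i and one drawn at
    random from j are identical by descent."""
    if i is None or j is None:
        return 0.0
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always expand the individual deeper in the pedigree, so that an
    # ancestor is never expanded before its descendant.
    if depth(i) < depth(j):
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))

def relatedness(i, j):
    """r = 2 * kinship; valid here because no one in the pedigree is inbred."""
    return 2.0 * kinship(i, j)

for pair in [("A", "B"), ("A", "C1"), ("A", "H"),
             ("GF", "C1"), ("B", "C1"), ("C1", "C2")]:
    print(pair, relatedness(*pair))
```

For this hypothetical pedigree the function returns the values listed in the text: 0.5 for full sibs and parent-offspring pairs, 0.25 for half-sibs, grandparent-grandchild, and uncle-nephew pairs, and 0.125 for first cousins.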

In nature, however, pedigrees usually are unknown, so several statistical 
methods have been developed and tailored for estimating coefficients of relat- 
edness from polymorphic genetic markers, such as those provided by multi- 
locus allozymes (Crozier et al. 1984; Pamilo and Crozier 1982; Queller and 
Goodnight 1989), DNA fingerprints (Reeve et al. 1992), or microsatellites 
(Blouin et al. 1996; Henshaw et al. 2001; Queller et al. 1993; Strassmann et al.
1996; Van de Casteele et al. 2001). Some of these approaches entail estimating 
average relatedness in assemblages of individuals, as implemented by comput- 
er programs such as Relatedness (Queller and Goodnight 1989). For example, 
Pamilo (1984a) derived an estimate for r that can be expressed in terms of
heterozygosities observed at a locus (h_obs) and those expected under Hardy-Weinberg
equilibrium (h_exp) within a colony m with N individuals, in comparison
to heterozygosities observed (H_obs) and expected (H_exp) within a broader
population composed of c colonies:


(6.1) 
Hap = ^ Hota . 


This coefficient of relatedness may also be interpreted as a genotypic correlation
among group members in a subdivided population (see Pamilo 1984a for deriva- 
tions and discussion). Other approaches entail estimates of relatedness between 
specific pairs of individuals (Epstein et al. 2000; Lynch and Ritland 1999; Ritland 
1996; Wang 2002), as implemented by computer programs such as Kinship, which 
uses a maximum likelihood statistical framework (Goodnight and Queller 1999). 
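
As a rough illustration of how marker data can be turned into a relatedness estimate, the following Python sketch computes a pairwise value in the spirit of the Queller and Goodnight (1989) estimator. It is a simplification under stated assumptions: population allele frequencies are treated as known without error (the published estimator applies a bias correction that is omitted here), one individual is arbitrarily treated as the focal animal, and all genotypes and frequencies are hypothetical.

```python
def pairwise_relatedness(focal, partner, allele_freqs):
    """Simplified pairwise relatedness estimate.

    focal, partner: lists of diploid genotypes, one (allele, allele)
        tuple per locus.
    allele_freqs: list of dicts (one per locus) mapping each allele to
        its assumed population frequency.
    """
    numerator = 0.0
    denominator = 0.0
    for g_focal, g_partner, freqs in zip(focal, partner, allele_freqs):
        for allele in g_focal:                    # each allelic position
            p_focal = g_focal.count(allele) / 2.0      # 0.5 or 1.0
            p_partner = g_partner.count(allele) / 2.0  # 0, 0.5, or 1.0
            p_pop = freqs[allele]
            numerator += p_partner - p_pop
            denominator += p_focal - p_pop
    if denominator == 0.0:
        raise ValueError("markers are uninformative for this focal individual")
    return numerator / denominator

# Hypothetical two-locus data set.
freqs = [
    {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25},
    {"E": 0.5, "F": 0.5},
]
x = [("A", "B"), ("E", "E")]
y = [("A", "B"), ("E", "E")]   # same multilocus genotype as x
z = [("C", "D"), ("F", "F")]   # shares no alleles with x

print(pairwise_relatedness(x, y, freqs))
print(pairwise_relatedness(x, z, freqs))
```

Estimates for a single pair based on few loci are noisy and can fall below zero, which is one reason many studies report mean values across groups rather than values for specific pairs.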


Inclusive Fitness and Kin Selection 


Classic genetic fitness is defined as the average direct reproductive success of an
individual possessing a specified genotype in comparison to that of other indi- 
viduals in the population. Inclusive fitness, which entails a broader view of the 
transmission of genetic material across generations, incorporates the individual's
personal or classic fitness as well as the probability that its genes may be passed 
on through relatives (Queller 1989, 1996). These latter transmission probabilities 
are influenced by the coefficients of relatedness involved. Concepts of inclusive 
fitness have been advanced as an explanation for the evolution of “self-sacrifi- 
cial” behaviors, wherein alleles influencing such altruism may have spread in cer- 
tain populations under the influence of kin selection. For example, under the 
proverbial example of altruistic behavior, an individual's alleles would tend to 
increase in frequency if his or her personal fitness was completely sacrificed for a 
comparable gain in fitness by more than two full sibs, four half-sibs, or eight first 
cousins.

In general, according to Hamilton's (1964) rule, a behavior is favored by kin 
selection whenever 


$$\Delta w_x + \sum r \, \Delta w_y > 0 \qquad (6.2)$$


where $\Delta w_x$ is the change the behavior causes in the individual's fitness, $\Delta w_y$ is the
change the behavior causes in the relative's fitness, and r is the genetic relatedness
of the individuals involved. Under Hamilton's rule, an allele will tend to increase
in frequency if the ratio of the cost C that it entails (loss in expected personal
reproduction through self-sacrificial behavior) to the benefit B that it receives
(through increased reproduction by relatives) is less than r:


$$C/B < r \qquad (6.3)$$
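
The arithmetic behind the "two full sibs, four half-sibs, or eight first cousins" example can be checked with a short Python sketch of inequality 6.2. The function name and the fitness units are illustrative assumptions; the sketch simply sums the fitness change to the actor and the relatedness-weighted fitness changes to each affected relative.

```python
def favored_by_kin_selection(delta_w_self, effects_on_relatives):
    """Evaluate Hamilton's rule.

    delta_w_self: change in the actor's own fitness (negative for a cost).
    effects_on_relatives: iterable of (r, delta_w) pairs, one per relative.
    """
    inclusive_change = delta_w_self + sum(r * dw for r, dw in effects_on_relatives)
    return inclusive_change > 0

# Complete sacrifice of personal fitness (-1) in exchange for one unit of
# fitness gained by each of n relatives of a given kind.
for label, r, n in [("full sibs", 0.5, 2), ("full sibs", 0.5, 3),
                    ("half-sibs", 0.25, 4), ("half-sibs", 0.25, 5),
                    ("first cousins", 0.125, 8), ("first cousins", 0.125, 9)]:
    print(n, label, favored_by_kin_selection(-1.0, [(r, 1.0)] * n))
```

Exactly two full sibs, four half-sibs, or eight first cousins bring the sum to zero, so the behavior is favored only when more than that number of relatives benefit.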


genetic studies of broader kinship often shifts from attempts to enumerate 
relationships among particular individuals (but see below) to a concern with 
patterns of mean genetic relatedness within and among groups. 

The concepts and reasoning that are involved in kinship assessment can 
be introduced by the following example (from Avise and Shapiro 1986): 
Juveniles of the serranid reef fish Anthias squamipinnis occur in social aggre- 
gations ranging in size from a few individuals to more than a hundred. 
Although eggs and larvae of this species are pelagic, drifting in the open 
ocean, Shapiro (1983) raised the intriguing hypothesis that juvenile aggrega- 
tions might consist of close genetic relatives (predominantly siblings from a 
single spawn) that had stayed together through the pelagic phase and settled 
jointly. If so, then kin selection would have to be considered as a potential fac- 
tor influencing behaviors within social aggregations, and furthermore, marine 
biologists would have to reevaluate the conventional wisdom that products of 
separate spawns are mixed thoroughly during the pelagic phase. To test the 
Shapiro hypothesis, genotypes were surveyed at each of three polymorphic 
allozyme loci in eight discrete social aggregations of juvenile A. squamipinnis 
from a single reef in the Red Sea. Allele frequencies are presented in Table 6.1. 
TABLE 6.2 Representative examples of coefficients of genetic relatedness (r) estimated
among females within colonies of various eusocial hymenopteran insects

Species                        Comparison    r            Colonies and queen matings (a)    Reference

Ants
Camponotus ligniperda          Workers       0.08 (b)     Polygyne                          Gertsch et al. 1995
Formica aquilonia              Workers       0.09         Polygyne                          Pamilo 1982
Formica polyctena              Workers       0.19-0.30    Polygyne                          Pamilo 1982
Formica sanguinea              Workers       0.31-0.42    Polygyne; queens multiply-mated   Pamilo and Varvio-Aho 1979
Formica transkaucasica         Workers       0.33         Polygyne; queens singly-mated     Pamilo 1981, 1982
Myrmecia pilosula              Workers       0.17         Polygyne                          Craig and Crozier 1979
Myrmica rubra                  Workers       0.02-0.54    Polygyne                          Pearson 1983
Nothomyrmecia macrops          Workers       0.17         Polygyne, occasionally            Ward and Taylor 1981
Rhytidoponera chalybaea        Workers       0.76         Monogyne; queen singly-mated      Ward 1983
Rhytidoponera confusa          Workers       0.70         Monogyne; queen singly-mated      Ward 1983
Solenopsis invicta             Workers       0.01-0.08    Polygyne; queens singly-mated     Ross and Fletcher 1985

Wasps
Agelaia multipicta             Workers       0.27         Polygyne                          West-Eberhard 1990
Cerceris antipodes             Females (c)   0.25-0.64    Polygyne                          McCorquodale 1988
Microstigmus comes             Females (c)   0.60-0.70    Monogyne; often singly-mated      Ross and Matthews 1989a,b
Parachartergus colobopterus    Females (c)   0.11         Polygyne, probably                Queller et al. 1988
Polybia occidentalis           Females (c)   0.34         Polygyne                          Queller et al. 1988
Polybia sericea                Females (c)   0.28         Polygyne; often singly-mated      Queller et al. 1988

Bees
Apis mellifera                 Workers       0.25-0.34    Highly polyandrous                Laidlaw and Page 1984

(a) Note how high relatedness within a nest depends on colonies being monogyne and possessing a queen who
was singly-mated.
(b) Based on microsatellite data. Other estimates came from protein-electrophoretic analyses and represent
mean values. Additional examples can be found in Crozier and Pamilo 1996.
(c) Reproductive and non-reproductive females not distinguished.
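
The dependence of within-colony relatedness on queen number and mating frequency can be made concrete with a small Python sketch. It rests on assumptions that go beyond the table itself: the standard haplodiploid expectations of r = 0.75 for full sisters and r = 0.25 for maternal half-sisters, queens and their mates all unrelated to one another, equal contributions by every queen and every male, and colonies large enough that sampling the same individual twice can be ignored. The function name is hypothetical.

```python
def expected_worker_relatedness(n_queens=1, n_mates=1):
    """Expected mean relatedness among workers in a haplodiploid colony
    under the simplifying assumptions described above."""
    r_full_sisters = 0.75
    r_half_sisters = 0.25
    # Two daughters of the same queen share a father with probability 1/n_mates.
    p_same_father = 1.0 / n_mates
    r_same_mother = (p_same_father * r_full_sisters
                     + (1.0 - p_same_father) * r_half_sisters)
    # Two workers share a mother with probability 1/n_queens; daughters of
    # different (unrelated) queens are taken to be unrelated.
    return r_same_mother / n_queens

for queens, mates in [(1, 1), (1, 2), (1, 10), (2, 1), (10, 1)]:
    print(queens, mates, round(expected_worker_relatedness(queens, mates), 3))
```

Under these assumptions only a monogyne colony with a singly-mated queen reaches r = 0.75, whereas a single highly polyandrous queen yields values approaching 0.25, and polygyny lowers the expectation further.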


species (Formica fusca), however, workers have been found to display
favoritism toward their own kin when rearing eggs and larvae in polygyne
colonies (Hannonen and Sundström 2003).

Another suggestion is that high frequencies of polygyne colonies and 
multiple mating by queens represent derived behaviors, rather than the 
ancestral conditions under which eusociality evolved. Under this hypothesis,
eusociality tends to arise through kin selection when populations are
highly structured along family lines, whereas subsequent maintenance and 
elaboration into advanced eusociality can occur even when within-colony 
relatedness decreases. Eusocial colonies, once formed, may operate so 
smoothly and successfully that the inclusive fitness of workers remains 
higher than if workers became egg-layers, such that evolutionary reversion 
to a less eusocial condition is simply not feasible. In ants, it is difficult to test 
the hypothesis that polygyne colonies and multiple mating by queens are 
derived conditions because most species are strongly eusocial. In a primi- 
tively eusocial bee, Lasioglossum zephyrum, a high molecular genetic estimate 
of intra-colony relatedness (r = 0.70) indicated that kin selection may oper- 
ate in this species (Crozier et al. 1987), but later analyses based on 
microsatellite markers in another Lasioglossum species (malachurum) showed
that about one-third of nests had been taken over by unrelated queens prior
to worker emergence (Paxton et al. 2002). Furthermore, in several wasp
species that also have primitive or incipient eusociality, r values within a
nest sometimes are only moderate to low (Strassmann et al. 1989, 1994).
Although these wasps may not necessarily provide valid representations of 
ancestral behavioral conditions, the genetic findings do demonstrate that 
low within-colony relatedness is not confined to the most advanced 
hymenopteran societies. 

Finally, various ecological-genetic hypotheses have been advanced to 
explain the conundrum of low genetic relatedness within some hymenopter- 
an colonies. For example, high genetic diversity among nestmates might 
diminish susceptibility to infectious parasites (Shykoff and Schmid-Hempel 
1991a,b) or permit the colony to perform better in some environments (Cole 
and Wiernasz 1999). Or caste determination might have a partial genetic 
basis that is conceivably allowed fuller expression by multiple mating or the 
formation of polygyne colonies (Crozier and Page 1985). To the extent that 
these or other strong adaptive benefits attend colonial living, the requirement
of close kinship for eusociality should be somewhat relaxed. Another possibility
is that collaborating queens fare proportionately better than individual
queens in competition for limited nest sites (Herbers 1986). Under this
hypothesis, concepts of inclusive fitness remain in partial effect if co-foundresses
are genetic relatives, as sometimes (but not always) appears to
be the case. In various hymenopteran species, molecular genetic appraisals 
of co-founding queens have revealed mean relatedness values ranging from 
r = 0.00 to r = 0.70 (Metcalf and Whitt 1977; Ross and Fletcher 1985; Schwartz 
1987; Stille et al. 1991; Strassmann et al. 1989). 


OTHER ARTHROPODS. Highly eusocial systems (or behavioral components 
thereof) have been discovered in several other taxonomic groups, and these 
cases are valuable for the similarities and contrasts they provide with the 
eusocial hymenopterans. One remarkable example involves marine shrimp 
in the genus Synalpheus, in which individuals often live together by the hun- 
dreds within a large sponge. Field observations coupled with data from 
allozyme markers have shown that these colonies are eusocial and that each
typically consists of full-sib animals (Duffy 1996; Duffy et al. 2002). 
Synalpheus shrimp are diploid. So too are termites, another group in which 
eusociality is well developed (Wilson 1975). These shrimp and termites 
demonstrate that eusociality is not invariably coupled to haplodiploid sex 
determination, a conclusion also evidenced by the fact that a few other 
arthropod groups (some mites, thrips, whiteflies, scale insects, and beetles) 
are haplodiploid but do not exhibit eusociality (Wilson 1975). 

Several termite species possess sex-linked multi-chromosome transloca- 
tion complexes that serve to elevate genetic relatedness both between sisters 
and between brothers (Syren and Luykx 1977; Lacy 1980), but this odd 
genetic system also lowers genetic relatedness between male and female sib- 
lings and thus is difficult to rationalize as being a prime causal factor in the 
evolution of termite eusociality (Andersson 1984; Leinaas 1983). Cyclic 
inbreeding-outbreeding is another proposed model that might promote 
eusociality by altering genetic relatedness within and among groups in such 
a way as to promote kin selection (Bartz 1979; see also Pamilo 1984b; 
Williams and Williams 1957). When male and female mates are unrelated 
but each is a product of intense inbreeding, their offspring can be nearly 
identical genetically, but only 50% like either parent (Figure 6.1C). When 
such conditions hold, any genes that behaviorally dispose siblings to stay 
together and assist their parents in rearing young might be favored for 
inclusive fitness reasons similar to those described above for the hap- 
lodiploid hymenopterans (see, however, Crozier and Luykx 1985). Termites 
possess several natural history features that favor social interactions and 
might set the stage for such a breeding cycle, such as living in protected and 
contained nests conducive to multi-generation inbreeding and passing sym- 
biotic intestinal flagellates from old to young individuals by anal feeding 
(an arrangement that necessitates close social behavior; Wilson 1971). 

In accounting for the evolution and maintenance of eusociality, an 
emerging sociobiological view is that haplodiploidy per se may seldom 
be the deciding factor after all, but instead is (at best) merely one of several
elements in a broader kin-selection framework of cost-benefit fitness
considerations. Queller and Strassmann (1998) describe several biological 
characteristics that consistently earmark two types of eusocial arthro- 
pods: “fortress defenders,” such as social aphids (Stern and Foster 1996), 
social beetles, and termites, which live inside a nest or protected site (a 
valuable resource that is both possible and necessary to defend as a 
group); and “life insurers,” such as ants, bees, and wasps, which forage in 
the open but nonetheless benefit from group behaviors because overlap- 
ping adult life spans are often needed to successfully care for young with- 
in the nest. In each case, the proposed benefits of sociality to an individual 
who cooperates closely with grouped kin (even if they are not extremely 
close relatives) presumably outweigh the high risks of go-it-alone person- 
al reproduction. 







NAKED MOLE-RATS. Another noteworthy example of eusociality involves 
a colonial vertebrate, the naked mole-rat (Heterocephalus glaber) (Jarvis 1981; 
Sherman et al. 1991). Brood care and other duties in this underground 
rodent species are performed cooperatively by mostly non-reproductive 
workers or helpers, who represent offspring from previous litters. The 
helpers assist the queen in rearing progeny that are fathered by a few select 
males within the burrow system. Using DNA fingerprint assays of colony 
members, Reeve et al. (1990) documented high band-sharing coefficients 
(0.88-0.99) comparable in magnitude to estimates for highly inbred mice or 
monozygotic twins in cows and humans. From these molecular data, they 
estimated mean within-colony genetic relatedness at r = 0.81, and accord- 
ingly suggested that a great majority of matings within a colony must be 
among siblings or between parents and offspring. Intense within-colony 
inbreeding is consistent with a strong role for kin selection in the evolution 
of eusociality in naked mole-rats. However, ecological and life history con- 
siderations are also important, as are phylogenetic constraints, as evi- 
denced by the fact that colonial and eusocial behaviors are displayed to 
widely varying degrees among different mole-rat species (Allard and 
Honeycutt 1992; Burda et al. 2000; Honeycutt 1992). For example, micro- 
satellite assays of a more outbred eusocial species (Cryptomys damarensis) 
yielded an estimate of mean within-colony relatedness of only r = 0.46 
(Burland et al. 2002). This finding suggests that even “normal” levels of 
family kinship within a colony can be sufficient for the evolution, or at least 
the retention, of eusociality in these mammals. It also suggests that while 
intense inbreeding and pronounced geographic population structure have 
been observed in mole-rats (Faulkes et al. 1997), these phenomena may first 
and foremost be responses to severe constraints on dispersal, especially 
given the predator-rich environments inhabited by these poor-sighted and 
rather defenseless animals (Braude 2000). 


Non-eusocial groups 


Most group-living species exhibit far less social organization and subdivision
of labor than do eusocial arthropods and mole-rats, but genetic relatedness
among group members remains of interest. A seminal compendium
on known or suspected kinship in group-living animal species was provided
by Wilson (1975). Traditionally, such genealogical understanding came
from difficult and labor-intensive field observations of mating and dispersal
(Fletcher and Michener 1987), but in the last three decades molecular markers
have assisted greatly in these evaluations.

Eastern tent caterpillars (Malacosoma americanum), for example, are characterized
by cooperative nest building as well as cooperative foraging along
pheromone trails. Adult moths of this diploid species lay egg masses from
which first-instar larvae emerge to feed on leaves at the tips of tree branches.
Later, the caterpillars move to central locations in a tree to initiate tent (nest)
construction. In a temporal genetic study using allozymes, Costa and Ross
(1993) found that mean genetic relatedness within colonies of newly emerged 
larvae (from a single egg mass) was r = 0.49, not significantly different from 
the expected value of r = 0.50 for full siblings. However, during the ensuing 8 
weeks, relatedness values declined to between r = 0.38 and r = 0.25. This tem- 
poral reduction in intra-colony relatedness represented an erosion of the ini- 
tial simple family structure, apparently due to frequent exchanges of individ- 
uals among colonies after foragers encountered pheromone trails of non-sib- 
lings. The results indicated that immigrants are not overtly discriminated 
against, but rather can be accepted into a colony. Subsequent observations 
suggested an adaptive explanation (Costa and Ross 2003): The increased 
group size that results from acceptance of immigrants was found to enhance 
mean fitness by promoting larval growth and enhancing the final larval 
weights attained (which are highly correlated with adult reproductive suc- 
cess). Thus, in tent caterpillars, individual fitness benefits stemming from 
augmented group size apparently more than offset the dilution of biological 
relatedness in these genetically heterogeneous social groups. 
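
A toy calculation helps to convey how much mixing such a decline in relatedness might imply. The model below is an illustrative assumption, not an analysis from Costa and Ross: a colony is imagined to consist of a fraction p of the original full sibs (r = 0.5 among themselves) plus immigrants unrelated to everyone, so that two randomly chosen colony members are both original sibs with probability of about p squared, and mean relatedness is about 0.5 times p squared.

```python
import math

def mean_relatedness(frac_residents, r_residents=0.5):
    """Mean pairwise r when only resident-resident pairs are related
    (toy model; large colony, immigrants unrelated to everyone)."""
    return r_residents * frac_residents ** 2

def resident_fraction_for(r_observed, r_residents=0.5):
    """Invert the toy model: fraction of original sibs implied by r_observed."""
    return math.sqrt(r_observed / r_residents)

for r_obs in (0.49, 0.38, 0.25):   # values reported in the text
    print(r_obs, round(resident_fraction_for(r_obs), 2))
```

Under this simplified picture, a drop in r from about 0.5 to 0.25 would correspond to roughly 30% of colony members being unrelated immigrants. Real colonies probably depart from the model, for example if immigrants arrive in groups of siblings.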

Similar genetic studies were conducted on day-roosting colonies of 
Phyllostomus hastatus bats in Trinidad. These colonies are subdivided into 
compact clusters of adult females that remain highly stable over several 
years and are attended by a single adult male, who from allozyme evidence 
sires most of the babies born to females within the “harem” (McCracken and 
Bradbury 1981). Stable groups of adult females are fundamental units of 
social structure in this species. It was hypothesized that harem females are 
matrilineal relatives, such that kin selection might be a plausible factor 
underlying their social or cooperative behavior. However, based on 
allozyme assays in conjunction with field observations, the females within 
each harem proved to be random samples from the total adult population, 
and hence were unrelated (McCracken and Bradbury 1977, 1981). These 
results indicated that juveniles are not recruited into parental social units 
and, therefore, that contemporary kin selection cannot explain the mainte- 
nance of behavioral cohesiveness in these highly social mammals. 

Conversely, several ground-dwelling squirrels in the family Sciuridae 
do have varying degrees of social organization built around matrilineal kin- 
ship (Michener 1983). For example, black-tailed prairie dogs (Cynomys 
ludovicianus) live in social groups (coteries) that typically consist of one or 
two adult males born outside the group, plus several adult females and 
young that are closely related. Females show strong tendencies to remain in 
their natal coteries for life (Hoogland and Foltz 1982). Genetic analyses 
based on pedigree and allozyme data documented that, despite this known 
matrilineal population structure, colonies are outbred due to coterie switch- 
ing by males and social avoidance of father-daughter matings (Foltz and 
Hoogland 1981, 1983; see also Chesser 1983; Dobson et al. 1997, 1998). 

The mound-building mouse (Mus spicilegus) constructs large earthen 
mounds containing nesting and food storage chambers. Each mound typically
houses 3-10 animals. Garza et al. (1997) scored molecular markers at
four autosomal and four X-linked microsatellite loci in individuals inhabit- 
ing 40 mounds in Bulgaria. Genetic results showed that at least two males 
and two females often had parented offspring in a mound, that parents of 
different sibships within mounds were more closely related than if they had 
been chosen at random from the population, and that adult females 
accounted for this excess relatedness. These genetic findings were interpret- 
ed to indicate that the mechanisms by which individuals congregate to build 
mounds are kin-based and that the evolution of communal nesting in this 
species could be due in part to kin selection. 

Many cetacean species (whales and dolphins) live in social groups 
called pods, which have been the subject of several studies employing 
molecular markers (see reviews in Hoelzel 1991a, 1994, 1998). For example, 
in the long-finned pilot whale (Globicephala melas), social groups typically 
consist of 50-200 animals. Their herding instincts have been exploited by 
native peoples to drive entire pods into shallow bays for slaughter. 
Analyses of DNA fingerprints from tissues taken from such harvests in the 
Faroe Islands revealed that adult males are not closely related to adult 
females within a pod, and furthermore, that 90% of fetuses had not been 
sired by a resident male (Amos et al. 1991a,b, 1993). From these data and 
behavioral observations, the authors concluded that social groups in the 
pilot whale are built around matrilineal kinship, with most inter-pod genet- 
ic exchange mediated by males. As deduced by mtDNA analyses as well, a 
tendency toward matrifocal organization of structured groups (either local- 
ly or associated with particular migratory pathways) is a recurring theme 
in several species of dolphins (e.g., Hoelzel et al. 1998a; Pichler et al. 1998) 
and whales (Baker and Palumbi 1996; Baker et al. 1998; Brown-Gladden et 
al. 1997; Hoelzel 1991b, 1998; Hoelzel and Dover 1991b; Hoelzel et al. 
1998b; O'Corry-Crowe et al. 1997; Palsbøll et al. 1997a,b). One ramification
of this structure is that populations can show significantly greater genetic 
subdivision in mitochondrial than in nuclear gene markers, as has been 
demonstrated, for example, in sperm whales (Physeter catodon) on a global 
scale (Lyrholm et al. 1999). 

Most reptiles are relatively asocial animals, but genetic and behavioral 
analyses of a large Australian lizard, Egernia saxatilis, have documented 
what is perhaps the first firm evidence for long-term "nuclear family" struc- 
ture in a reptilian species. Parentage and kinship assessments based on 
microsatellite markers revealed tendencies toward multi-year monogamy 
and group stability, with up to three annual cohorts of full-sib offspring liv- 
ing with their parents (O'Connor and Shine 2003). Overall, 85% of the sur- 
veyed juveniles lived in social groups, and 65% lived in family groups with 
at least one of their biological parents (39% with both parents). An entirely 
different outcome was reported in an avian species that had been suspected 
of spending extended periods of time in family groups. In a DNA finger- 
printing study of the long-eared owl (Asio otus), most birds in communal 
winter roosts proved not to be close kin, as evidenced by the fact that mean
genetic relatedness within roosts was not significantly higher than that 
between roosts (Galeotti et al. 1997). 

Questions concerning kinship also arise in grouped plants. The white- 
bark pine (Pinus albicaulis) frequently displays a multi-stem form. Allozyme 
analyses have demonstrated that stems within a clump are genetically dis- 
tinct individuals (genets), but are nonetheless more similar to one another 
than to individuals in other clumps (Furnier et al. 1987). This family struc- 
ture appears to be a direct result of seed-caching behavior by birds, espe- 
cially Clark's nutcrackers (Nucifraga columbiana), which often deposit multi- 
ple seeds from related cones at particular locations. 

The limber pine (Pinus flexilis) is another species that exhibits a multi- 
trunk growth form, perhaps registering similar seed-caching behavior by 
birds or perhaps registering the presence of multiple ramets from a single 
genetic individual. From allozyme analyses, nearly 20% of multi-trunk clus- 
ters proved to be composed of two to four genetically different individuals, 
and mean genetic relatedness within these clusters was r = 0.19, or slightly 
less than expected for half-sibs (Schuster and Mitton 1991). The authors note 
that such grouping of distinct but related genets opens the possibility of kin 
selection, a phenomenon seldom considered in plants. Occasional fusions or 
grafts among adjacent woody trunks are also observed in limber pines, and 
the authors found that fused genets were related significantly more closely 
than genets that were unfused. However, it remains uncertain whether such 
fusion behavior might have evolved in part under kin selection via possible 
adaptive advantages to the participants (such as joint translocation of water 
and nutrients, or added physical stability). 


Kin recognition 


The spatial co-occurrence of close kin in virtually all species raises addi- 
tional questions about whether individuals can somehow assess their genet- 
ic relatedness to others and perhaps adjust competitive, cooperative, altru- 
istic, or other behaviors accordingly (Waldman 1988; Wilson 1987). In study- 
ing such issues, ethologists traditionally monitored interactions among 
organisms supposed to exhibit varying levels of genetic relatedness as 
gauged by behavioral observations or by pedigree records in captive set- 
tings (Fletcher and Michener 1987; Hepper 1991). However, these conven- 
tional lines of evidence for kinship are less than fully reliable, and in any 
event are unavailable for many species. Molecular markers are now rou- 
tinely employed to assist with relatedness assessments, several examples of 
which have already been mentioned. 

Another classic example involved a free-living population of Belding's 
ground squirrels (Spermophilus beldingi) in California, for which Holmes and 
Sherman (1982) employed protein electrophoretic techniques to distinguish 
full siblings from maternal half-sibs resulting from multiple mating. 
Subsequent behavioral monitoring indicated that full sisters fought significantly
less often and aided each other more than did half-sisters. Such nepotism
(favoritism shown kin) must require an ability by ground squirrels to
judge relatedness. Additional experiments indicated that the proximate cues 
by which this is accomplished in S. beldingi involve physical association dur- 
ing rearing as well as “phenotypic matching,” whereby an individual 
behaves as if it had compared phenotypic traits (genetically determined) 
against itself or a nestmate template (Holmes and Sherman 1982). 

Another postulated advantage of kin recognition involves behavioral 
avoidance of close inbreeding (Hoogland 1982). Like many amphibians, the 
American toad (Bufo americanus) exhibits site fidelity to natal ponds for 
breeding, and thus individuals are likely to encounter siblings as potential 
mates (Waldman 1991). Can siblings recognize close kin and avoid incestu- 
ous mating? Waldman et al. (1992) monitored mtDNA genotypes in 86 
amplexed pairs of toads and found significantly fewer matings between pos- 
sible siblings (with shared haplotypes) than expected from haplotype fre- 
quencies in the local population, which led the authors to suggest that “sib- 
lings recognize and avoid mating with one another.” They further suggested 
that the proximate cues involved might include advertisement vocalizations 
by males, because resemblance among male calls proved to be positively cor- 
related with genetic relatedness as gauged by band similarities in DNA fin- 
gerprints. Thus, females could potentially employ male vocalizations (or 
other genetically based clues such as odors) in kinship assessment. 
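
The logic of comparing observed and expected numbers of same-haplotype matings can be sketched as follows. This is a hedged illustration rather than the test actually used by Waldman et al. (1992): the haplotype frequencies and the observed count are hypothetical (only the figure of 86 pairs comes from the text), males and females are assumed to share the same haplotype frequencies, and a simple one-tailed binomial probability stands in for whatever statistic the authors applied.

```python
from math import comb

def expected_shared_fraction(hap_freqs):
    """Probability that a randomly paired male and female carry the same
    mtDNA haplotype, given population haplotype frequencies."""
    return sum(p * p for p in hap_freqs.values())

def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

hap_freqs = {"hap1": 0.5, "hap2": 0.3, "hap3": 0.2}   # hypothetical
n_pairs = 86                                          # from the text
observed_shared = 20                                  # hypothetical

p_shared = expected_shared_fraction(hap_freqs)
print("expected same-haplotype pairs:", round(n_pairs * p_shared, 1))
print("P(observed or fewer):",
      binomial_lower_tail(observed_shared, n_pairs, p_shared))
```

Because unrelated toads can also share a haplotype, a shared haplotype marks a pair only as possible siblings, and a deficit of such pairs is indirect evidence for sibling avoidance.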


Genetic relationships of specific individuals 


Most of the cases described above entailed estimates of average relatedness 
within and among colonies or social groups, but another approach is to 
attempt assessments of genetic kinship among particular individuals. In one 
classic example, DNA fingerprinting assays were applied to African lions 
that, based on prior field observations, were thought to exist as matriarchal 
groups (Gilbert et al. 1991; Packer et al. 1991). A lion pride typically consists 
of 2-9 adult females, their dependent young, and 2-6 adult males, original- 
ly from outside the group, that have formed a coalition. Incoming males col- 
laborate to evict resident males and often kill resident dependent juveniles. 
From analyses of minisatellite DNA fingerprints gathered from nearly 200 
animals, the following conclusions were reached (Figure 6.2): female com- 
panions within prides proved invariably to be closely related; male coalition 
partners were either closely related (in some larger coalitions) or genetical- 
ly unrelated (mostly in some smaller coalitions involving two or three 
males); and mating partners usually were unrelated. Furthermore, genetic 
parentage analyses revealed that resident males sired all cubs conceived 
during their tenure, and that the variance in male reproductive success 
increased greatly as coalition size increased. From these molecular observa- 
tions, the authors concluded that lion prides are indeed matrilineal, and that 
a coalition male is likely to act as a non-reproductive “helper” only if the 
coalition that he entered includes closely related males. 

Figure 6.2 Frequency distributions of minisatellite band sharing in Serengeti
lions. Percentages of bands shared are indicated between (A) females born into the
same versus different prides; (B) male coalition partners known to have been born
into the same versus different (or in some cases unknown) prides; and (C) coalition
males and resident females. (After Packer et al. 1991.)

These molecular studies on lions benefited from the fact that long-term 
field observations and pedigrees were sometimes available to calibrate the 
extent of minisatellite band sharing against known or suspected kinship. 
Such analyses also revealed, however, that the relationship between magni- 
tude of DNA band sharing and kinship for pairs of individuals can be non- 
linear, population-specific, and can display a large variance. One technical 
reason may be that complex multi-locus banding profiles in minisatellite 
fingerprints are notoriously difficult to score (Baker et al. 1992; Prodöhl et al. 
1992; van Pijlen et al. 1991). However, another potential complication in 
assessing kinship, which applies to all types of molecular assays, is biologi- 
cal: DNA profiles reflect kinship attributable not only to contemporary 
pedigrees, but also to earlier demographic histories that may have included 
such factors as population bottlenecks or inbreeding, which can leave last- 
ing genetic signatures (Hoelzel et al. 2002a,b). Thus, kinship is a contextual 
concept, with empirical molecular estimates properly interpreted vis-à-vis 
some stated (or sometimes unstated) baseline that may include deep as well 
as shallow population history. 

A case in point involves the dwarf fox (Urocyon littoralis), which colo- 
nized the Channel Islands off Southern California within the last 20,000 
years. All assayed foxes from a small isolated island (San Nicolas) exhibited 
identical bands in DNA fingerprints, and several other island populations 
showed greatly enhanced levels of band sharing (75%-95%) relative to foxes 
from different islands (16%-56%) and relative to values (10%-30%) typifying
outbred populations in many other vertebrate species (Gilbert et al. 
1990). Thus, the astonishingly high kinship coefficients registered among 
individuals on San Nicolas may well reflect a history of population
bottleneck(s) more than nonrandom mating (inbreeding) per se within the island 
population. 

Molecular estimates of pairwise kinship among individuals inevitably 
have a large sampling variance when small numbers of loci are employed. 
Although it is usually quite feasible with small or modest numbers of 
molecular markers to reliably distinguish full sibs (r = 0.50) from half-sibs (r 
= 0.25) or nonrelatives (r = 0.00, in principle), finer meaningful distinctions 
(i.e., within the range of r = 0.00-0.25) remain problematic. Yet there are 
numerous biological settings in which ethologists and other researchers 
would welcome secure genetic knowledge of precise kinship between inter- 
acting individuals, the estimation of which is a current "holy grail" for 
molecular ecology. Thus, it will be extremely interesting to monitor devel- 
opments in kinship assessment in this new era of massive genomic screen- 
ing. With genetic information now obtainable in principle (at least in model 
organisms) from legions of qualitative genotypic markers such as 
microsatellites and SNPs, unprecedented opportunities should arise for 
refining pairwise estimates of individual relationships based on scores or 
even hundreds of independent loci (Glaubitz et al. 2003). Such applications 
are still in their infancy at the time of this writing, but might someday 
become hugely important in studies addressing such diverse topics as social 
behaviors in nature (e.g., Morin et al. 1994a), mating systems (Heg and van 
Treuren 1998), phenotypic heritabilities (Ritland 2000), and spatial popula- 
tion genetics (see below). 
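
The dependence of sampling variance on the number of loci can be illustrated with a small simulation. The sketch below is an idealization rather than any published estimator: it assumes that identity by descent (IBD) can be observed directly at each locus and that loci are independent, whereas real markers reveal only identity in state and therefore add further noise. The function names, the number of simulated pairs, and the random seed are all arbitrary choices made for illustration.

```python
import random
import statistics

# Probabilities that a pair shares 0, 1, or 2 alleles identical by descent at a locus.
IBD_PROBS = {
    "full_sibs": (0.25, 0.50, 0.25),  # expected r = 0.50
    "half_sibs": (0.50, 0.50, 0.00),  # expected r = 0.25
    "unrelated": (1.00, 0.00, 0.00),  # expected r = 0.00
}


def simulate_r(relationship, n_loci, rng):
    """Estimate r for one pair as the mean fraction of alleles shared IBD per locus."""
    shared = rng.choices((0, 1, 2), weights=IBD_PROBS[relationship], k=n_loci)
    return sum(shared) / (2 * n_loci)


def spread_of_estimates(relationship, n_loci, n_pairs=2000, seed=1):
    """Mean and standard deviation of r estimates across many simulated pairs."""
    rng = random.Random(seed)
    estimates = [simulate_r(relationship, n_loci, rng) for _ in range(n_pairs)]
    return statistics.mean(estimates), statistics.stdev(estimates)


for n_loci in (10, 100):
    for relationship in ("full_sibs", "half_sibs"):
        mean_r, sd_r = spread_of_estimates(relationship, n_loci)
        print(f"{relationship:>9} {n_loci:>4} loci: mean r = {mean_r:.3f}, SD = {sd_r:.3f}")
```

Under these assumptions the standard deviation of the estimate shrinks roughly with the square root of the number of loci, which is why distinctions within the range r = 0.00-0.25 call for many more markers than the full-sib versus half-sib contrast.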


Geographic Population Structure and Gene Flow 


Populations of nearly all species, social or otherwise, exhibit at least some 
degree of genetic differentiation across geography (Ehrlich and Raven 1969), 
if for no other reasons than because siblings usually begin life near one 
another and their parents and because mating partners seldom represent 
random draws from throughout a species’ geographic range (Turner et al. 
1982). In an influential study of such “population structure” on a microgeo- 
graphic scale, Selander (1970) employed allozyme markers to demonstrate 
fine-scale spatial clustering of genotypes of house mice (Mus musculus) 
within and among barns on a farm. The spatial variation in this case was 
apparently due to tribal family structure in these mice and genetic drift in 
this small population. 

Population genetic structure sometimes exists even in seemingly 
improbable settings. For example, mosquitofish (Gambusia) are abundant 
and highly dispersive creatures, yet extensive sampling revealed statistical- 
ly significant differences in allozyme frequencies along a few hundred 
meters of shoreline (Kennedy et al. 1985, 1986), as well as significant tem- 
poral variation at particular locales over periods as short as a few weeks 
(McClenaghan et al. 1985). At broader geographic and longer temporal 
scales, mosquitofish populations have shown additional differentiation 
often hierarchically arranged at several levels: across ponds and streams 
within a local area, reservoirs within a river drainage, drainages within a 
region, and regional collections of drainages that house deep genetic differ- 
ences associated with species-level separations perhaps dating to the 
Pleistocene (Scribner and Avise 1993a; M. H. Smith et al. 1989; Wooten et al. 
1988). Various molecular markers have similarly been employed to assess 
geographic population structure due to genetic drift, various forms of selec- 
tion, spatial habitat structure, isolation by distance, social organization, and 
other ecological and evolutionary factors in many hundreds of animal 
species at a wide variety of spatial and temporal scales. 

Populations of most plant species also vary in genetic composition, 
sometimes over microspatial areas of a few kilometers or even meters (Levin 
1979). For example, due in part to a self-fertilization reproductive mode and 
limited gene flow, large populations of wild wheat (Triticum dicoccoides) 
showed pronounced genetic structure over distances of less than 5 km 
(Golenberg 1989). In the grasses Agrostis tenuis and Anthoxanthum odoratum, 
sharp clinal variation was detected in several genetic characters across 
meter-wide ecotones between pastures and lead-zinc mines, as a result of 
strong disruptive selection for heavy metal tolerance and flowering time 
(Antonovics and Bradshaw 1970; McNeilly and Antonovics 1968). In many 
plant species, gene flow via pollen and seed dispersal is sufficiently limited 
that estimates of neighborhood size (the population within which mating is 
random) often include less than a few hundred individuals occupying areas 
less than 50 m² (Bos et al. 1986; Calahan and Gliddon 1985; Fenster 1991; 
Levin and Kerster 1971, 1974; Smyth and Hamrick 1987). As with animal 
populations, additional genetic structure normally is to be expected over 
greater spatial and temporal scales. 

A continuing challenge is to describe population genetic architectures 
within species (Box 6.2) and to identify and order the biological forces 
responsible. Broadly speaking, these forces may involve migration or gene 
flow (Box 6.3), random genetic drift, various modes of natural selection, 
mutational divergence, and the opportunity for genetic recombination 
mediated by organismal behaviors and mating systems. Finer considera- 
tions require partitioning these general categories into biological factors rel- 
evant to each organismal group. For example, numerous ecological and life 
history factors are predicted to influence population genetic structure in 
plants (Table 6.3). Comparative summaries of the allozyme literature for 
more than a hundred plant taxa revealed that magnitudes of genetic differ- 
entiation are indeed roughly associated with such factors as a species’ 
breeding system, reproductive mode, pollination mechanism, floral mor- 
phology, life cycle, life form, and successional stage (Hamrick and Godt 
1989; Loveless and Hamrick 1984). 

In animals, a comparative summary of allozyme analyses on more than 
300 species (Table 6.4) led Ward et al. (1992) to conclude that mobility tends to 
be especially well correlated with relative magnitudes of population struc- 
ture. For example, vagile organisms such as insects and birds often show sig- 
nificantly less population structure than do relatively sedentary creatures 
such as some reptiles and amphibians. In another meta-review of population 
genetic structure with a similar outcome, Bohonak (1999) found that F_ST
values (as gauged by molecular assays) were negatively correlated with
dispersal potential (usually inferred from morphological traits of propagules) in 19 
of 20 animal groups examined. In the sections that follow, a few specific cases 
will highlight how particular ecological and evolutionary factors can impinge 
on population genetic structure as revealed by molecular markers. Where 
possible, attempts will be made to draw meaningful parallels between results 
for taxonomically distinct groups such as plants and animals. 


Autogamous mating systems 


PLANTS. In a paradigmatic series of studies employing allozyme markers, 
Allard and colleagues documented that the mating system can assume dra- 
matic influence, especially in conjunction with natural selection, in shaping 
the multi-locus genetic architectures of plant species. The slender wild oat 
(Avena barbata) is a predominantly self-fertilizing species that was intro- 
duced to California from its native range in the Mediterranean during the 

TABLE 6.4 Comparative summary of population structures for 321 animal species
surveyed by multi-locus protein electrophoresis

                      Population differences^a
Taxonomic group       (with SE)                   Number of species

Vertebrates
  Mammals             0.242 ± 0.030                57
  Birds               0.076 ± 0.020                16
  Reptiles            0.258 ± 0.050                22
  Amphibians          0.315 ± 0.040                33
  Fishes              0.135 ± 0.040                79
  TOTAL               0.202 ± 0.015               207

Invertebrates
  Insects             0.097 ± 0.015                46
  Crustaceans         0.169 ± 0.061                19
  Mollusks            0.263 ± 0.036                44
  Others              0.060 ± 0.021                 5
  TOTAL               0.171 ± 0.020               114

Source: After Ward et al. 1992.

^a Shown are proportions of total genetic variation within species due to genetic
differences between geographic populations, as reflected in the “coefficient of
gene differentiation,” G_ST = (H_T - H_S)/H_T, where H_S and H_T are mean
heterozygosities estimated within local populations and within the entire
species, respectively (Nei 1973).
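
The coefficient of gene differentiation reported in Table 6.4 is simple to compute once the two mean heterozygosities are in hand. The following is a minimal sketch: the function name and the example heterozygosities are hypothetical and are not taken from Ward et al. (1992).

```python
def gene_differentiation(h_s, h_t):
    """Coefficient of gene differentiation (Nei 1973): (H_T - H_S) / H_T.

    h_s: mean heterozygosity within local populations.
    h_t: mean heterozygosity within the entire species.
    """
    if h_t <= 0:
        raise ValueError("h_t must be positive")
    return (h_t - h_s) / h_t


# Hypothetical values: within-population heterozygosity 0.076, species-wide 0.100.
g_st = gene_differentiation(h_s=0.076, h_t=0.100)
print(f"G_ST = {g_st:.3f}")
```

With these illustrative inputs, about a quarter of the total gene diversity would lie between populations, a value of the same order as those tabulated for mammals and reptiles.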


BOX 6.3 Genetic Exchange among Populations 





Gene flow is the transfer of genetic material between populations resulting
from movements of individuals or their gametes. Usually, gene flow is
expressed as a migration rate m, defined as the proportion of alleles in a
population that is of migrant origin each generation. Gene flow is notoriously
difficult to monitor directly, but it is commonly inferred from spatial
distributions of genetic markers by several statistical approaches. Most of
these approaches are based on equilibrium expectations derived from neutrality
theory as applied to idealized models of population structure. Examples include
the “island model,” wherein a species is subdivided into equal-sized populations
(demes or islands of size N), all of which exchange alleles with equal
probability; and the “stepping-stone” model, wherein gene flow occurs between
adjacent demes only. Allele frequencies in finite populations are also
influenced by random genetic drift, which is a function of effective population
size (see Box 2.2). Thus, the influences of drift and gene flow are difficult to
tease apart, and most statistical procedures estimate only the product Nm, which
can be interpreted as the absolute number of individuals exchanged between
populations per generation. Also, Nm is of particular interest because under
neutrality theory, the level of divergence among populations that are at
equilibrium between gene flow and genetic drift is a function of migrant numbers
rather than of the proportions of individuals exchanged. The most common
approaches to estimating Nm and gene flow from molecular data are as follows:


1. From F-statistics (Cockerham and Weir 1993). Wright (1951) showed that for 
neutral alleles in an island model, equilibrium expectations are 

$$F_{ST} = \frac{1}{1 + 4Nm}$$

or, equivalently, 

$$Nm = \frac{1 - F_{ST}}{4F_{ST}} \tag{6.8}$$

Nei (1973) defined a related measure of between-population heterogeneity
(gene diversity, or G_ST) that bears the same relationship to Nm and also is 
employed widely. Takahata and Palumbi (1985) suggested modifications of 
these basic statistics for extra-nuclear haploid genomes such as mtDNA, 
and Lynch and Crease (1990) proposed an analogue of the F_ST or G_ST indices 
(N_ST) that is applicable to data at the nucleotide level. 


2. From private alleles. Private alleles are those found in only one population. 
For a variety of simulated populations, Slatkin (1985a) showed by computer 
analyses that the natural logarithm of the average frequency of private
alleles [p(1)] is related to the natural logarithm of Nm according to

$$\ln p(1) = -0.505 \ln(Nm) - 2.44$$

or, equivalently, 

$$Nm = e^{-[(\ln p(1) + 2.44)/0.505]} \tag{6.9}$$

This result proved insensitive to most changes in parameters of the model, 
except that a correction for Nm due to differences in the mean number of 
individuals sampled per population was recommended (Barton and Slatkin 
1986). The rationale underlying Slatkin's method is that private alleles are 
likely to attain high frequency only when Nm is low. In practice, when
sufficient genetic information is available, the F_ST and private allele methods are 
expected to yield comparable estimates of gene flow under a wide variety 
of population conditions (Slatkin and Barton 1989). 


3. From allelic phylogenies. Unlike the two approaches described above, which
can be applied to phylogenetically unordered alleles (such as those provided
by allozymes), this method requires knowledge of the phylogeny of
non-recombining segments of DNA (such as mtDNA haplotypes). Given the correct
gene tree and knowledge of the geographic populations in which the 
allelic clades are found, a parsimony criterion is applied to estimate the 
minimum number of migration events (s) consistent with the phylogeny.
Slatkin and Maddison (1989) showed that the distribution of this minimum 
number is a simple function of Nm, which therefore can be estimated from 
empirical data by comparison with tabulated results from their
computer-simulated populations.
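
The first two approaches in this box reduce to short calculations, sketched below under the box's own assumptions (neutral alleles, island model, equilibrium between gene flow and drift). The function names are illustrative, the input values are hypothetical, and the private-allele function omits the sample-size correction recommended by Barton and Slatkin (1986).

```python
import math


def nm_from_fst(fst):
    """Island-model estimate of Nm from F_ST (equation 6.8): Nm = (1 - F_ST) / (4 F_ST)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("fst must lie in the interval (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


def nm_from_private_alleles(p1):
    """Estimate of Nm from the average frequency of private alleles (equation 6.9).

    Inverts ln p(1) = -0.505 ln(Nm) - 2.44; no sample-size correction is applied.
    """
    if not 0.0 < p1 <= 1.0:
        raise ValueError("p1 must lie in the interval (0, 1]")
    return math.exp(-(math.log(p1) + 2.44) / 0.505)


# Hypothetical inputs for illustration only.
print(f"F_ST = 0.20 -> Nm = {nm_from_fst(0.20):.2f}")
print(f"p(1) = 0.05 -> Nm = {nm_from_private_alleles(0.05):.2f}")
```

Both functions decline steeply as their argument rises, reflecting the expectation that strong differentiation and common private alleles arise only when few migrants are exchanged per generation.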
invertebrates, several genetic studies have reported a correspondence 
between increased potential for larval dispersal and diminished genetic 
differentiation among geographic populations (Ayre et al. 1997; Berger 
1973; Crisp 1978; Gooch 1975; Liu et al. 1991). For example, a non-plank- 
tonic egg-casing snail, Nucella canaliculata (Sanford et al. 2003), and a larval- 
brooding snail, Littorina saxatilis, showed pronounced population structure 
in molecular markers that contrasted with the less structured pattern 
observed in a free-spawning marine snail, L. littorea (Janson 1987). In sea 
urchins of the genus Heliocidaris, one species (H. tuberculata) with a several- 
week planktonic larval stage showed little differentiation in mtDNA geno- 
types between populations separated by 1,000 km of open ocean, whereas 
populations of a congener (H. erythrogramma) with only a 3- to 4-day plank- 
tonic larval duration were strongly partitioned over comparable geograph- 
ic scales (McMillan et al. 1992). In a solitary coral species that broods its lar- 
vae (Balanophyllia elegans), allozyme population structure along the 
California coast proved to be substantially greater than that in a co-distrib- 
uted solitary coral species (Paracyathus stearnsii) with planktonic larvae 
(Hellberg 1996). Likewise, in comparative allozyme surveys of nine coral 
species in the genera Acropora, Pocillopora, Seriatopora, and Stylophora, pop- 
ulation genetic structure along Australia’s Great Barrier Reef usually (but 
not invariably) proved to be somewhat greater in brooding species than in 
broadcast spawners (Ayre and Hughes 2000). 

Among the vertebrates, Pacific damselfishes with pelagic larvae 
showed allozyme uniformity over huge areas, whereas one assayed species 
that lacks a pelagic larval phase (Acanthochromis polyacanthus) was highly 
structured genetically (Ehrlich 1975; Planes and Doherty 1997). Another 
marine fish that lacks a pelagic phase, the black surfperch (Embiotoca jack- 
soni), likewise shows strong geographic population structure, as evidenced 
in this case by mtDNA (Bernardi 2000; Doherty et al. 1995). Waples (1987) 
assessed allozyme differentiation in several species of marine shore fishes 
sampled along the same geographic transect in the eastern Pacific and 
reported a strong negative correlation with dispersal capability as inferred 
from planktonic larval durations (Figure 6.3B): The species with the lowest 
potential for dispersal (a livebearer with no pelagic larval stage, Embiotoca 
jacksoni) exhibited the highest spatial genetic structure, whereas the species 
with the highest dispersal potential (a fish associated with drifting kelp and 
characterized by an extended larval duration, Medialuna californiensis) exhib- 
ited no detectable spatial genetic differentiation. Such results also appear to 
be generally consistent with the long-noted tendency for marine species 
with dispersive larvae to rapidly colonize oceanic islands and to exhibit 
broader geographic ranges than those with sedentary larvae (Jablonski 1986; 
Thorson 1961; but see Thresher and Brothers 1985 for exceptions). 

Population genetic structures in North Atlantic eels have attracted par- 
ticular interest because of the extraordinary catadromous life histories of 
these species (see review in Avise 2003b). Juvenile eels (Anguilla rostrata in 
the Americas, A. anguilla in Europe) inhabit coastal and inland waters for 
most of their lives, but during sexual maturation they migrate to the western
tropical mid-Atlantic Ocean, where spawning takes place. Conventional 
wisdom (reviewed by Williams and Koehn 1984) was that conspecific larvae 
produced from each suspected mass spawn passively disperse via ocean 
currents to continental margins, perhaps settling at locales randomly ori- 
ented with respect to the homesteads of their parents. If mating is indeed 
quasi-panmictic and larval dispersal is passive, then all continental popula- 
tions could represent nearly random draws from the species’ gene pool, and 
accordingly would lack appreciable spatial genetic structure. Molecular 
data for A. rostrata and A. anguilla collected throughout their respective con- 
tinental ranges are roughly consistent with this scenario. Several studies of 
A. rostrata from across eastern North America have documented a near or 
total absence of spatial structure in mtDNA and in polymorphic allozymes 
and microsatellite loci (Avise et al. 1986; Koehn and Williams 1978; Mank 
and Avise 2003; Williams et al. 1973; Wirth and Bernatchez 2003). For A. 
anguilla sampled across Europe, population genetic structure also appears 
slight (albeit statistically significant) at microsatellite loci (F_ST = 0.002) and in 
mtDNA (Lintas et al. 1998; Maes et al. 2002; Wirth and Bernatchez 2001). 

In contrast, American and European eels are clearly distinct genetically, 
confirming the much-debated presence of at least two largely independent 
gene pools in the North Atlantic (see review in Avise 2003b). Additional 
genetic analysis also revealed the possible low-frequency presence of 
hybrids between A. rostrata and A. anguilla in Iceland (Avise et al. 1990b). 
This island is longitudinally intermediate to North America and Europe and 
is thousands of kilometers from where the zygotes presumably arose, so 
these genetic findings raise the intriguing possibility that hybrid larvae, if 
they truly exist (more definitive genetic data are needed), might have inter- 
mediate migratory behavior. 

In general, long-duration planktonic larvae (as well as highly mobile 
adults in many marine taxa) afford opportunities for extensive gene flow, 
and such potential appears to have been realized in diverse species of 
marine invertebrates and vertebrates, as evidenced by a paucity of allozyme 
or mtDNA differentiation over vast areas. This is true, for example, among 
populations of several sea urchin species in the genera Echinothrix and 
Strongylocentrotus in long transects across parts of the Pacific Ocean (Lessios 
et al. 1998; Palumbi and Wilson 1990); among populations of rock lobster 
(Jasus edwardsii) across 4,600 km of Australasian habitat (Ovenden et al. 
1992); in tiger prawns (Penaeus monodon) throughout the southwestern 
Indian Ocean (Forbes et al. 1999); in abyssal mussels (Bathymodiolus ther- 
mophilus) from hydrothermal vents scattered across the eastern Pacific 
(Craddock et al. 1995); within each of several species of Caribbean reef fish- 
es from locales as much as 1,000 km apart (Lacson 1992; Shulman and 
Bermingham 1995); in walleye pollack (Theragra chalcogramma) sampled 
throughout the Bering Sea (Shields and Gust 1995); among damselfish 
(Stegastes fasciolatus) populations throughout the 2,500-km Hawaiian archi- 
pelago (Shaklee 1984); among milkfish (Chanos chanos) populations from 
al. 2000) have all demonstrated that contemporary generation-to-generation 
connections among populations of many species with planktonic larvae 
may be far lower than traditionally supposed. At least over time frames of 
perhaps hundreds to thousands of generations, most larval recruitment may 
be at a sufficiently local scale to permit substantial genetic differentiation 
among conspecific populations. One final example involves the cleaner 
goby (Elacatinus evelynae), which, despite an extended pelagic duration of 21 
days, showed strong geographic structure in mtDNA across the Caribbean 
(Taylor and Hellberg 2003). 

Conversely, diversifying natural selection acting on particular loci via 
differential survival or mating success might sometimes convey a false 
impression of low gene flow among highly connected populations. For 
example, in the blue mussel (Mytilus edulis), allele frequencies at a leucine 
aminopeptidase (Lap) allozyme locus are significantly heterogeneous spa- 
tially, but are strongly correlated with environmental salinity. Physiological 
and biochemical studies have indicated that these alleles function differen- 
tially in relieving osmotic stress in environments of varying salinity via their 
influence on the free amino acid pools and volumes of cells (Hilbish et al. 
1982). Thus, frequencies of these non-neutral Lap alleles probably say more 
about environmental conditions than about the gene flow regime of the 
species (Boyer 1974; Koehn 1978; Theisen 1978). At other polymorphic 
allozyme loci, these same mussel populations exhibited large, moderate, 
and small inter-population variances in allele frequencies (Koehn et al. 
1976), such that estimates of gene flow under assumptions of neutrality dif- 
fered considerably across genes. 

A sobering example of how different genetic markers can sometimes 
paint contrasting pictures of gene flow involves populations of the 
American oyster (Crassostrea virginica) from the Gulf of Mexico and Atlantic 
coasts of the southeastern United States. Surveys of polymorphic allozymes 
revealed a near uniformity of allele frequencies throughout this range 
(Figure 6.4A), a result understandably attributed to high gene flow resulting 
from "the rather long planktonic stage of larval development, since this 
species has the ability to disperse zygotes over great distances when facili- 
tated by tidal cycles and oceanic currents" (Buroker 1983). However, 
mtDNA genotypes revealed a dramatic genetic "break," involving cumula- 
tive and nearly fixed mutational differences that cleanly distinguished most 
Atlantic from Gulf oyster populations (Reeb and Avise 1990). Subsequent 
surveys of nuclear DNA markers tended to support the dramatic 
Atlantic/Gulf mtDNA dichotomy (Karl and Avise 1992; Hare and Avise 
1996, 1998; Figure 6.4B) and thus seem to eliminate differences in dispersal 
of male versus female gametes as a likely explanation for the contrasting 
population structures registered by allozymes and mtDNA. One possibility 
is that some of the allozyme loci surveyed may be under uniform balancing 
selection and thus do not register the population subdivision that seems 
clearly evidenced by multiple DNA markers in the nucleus and cytoplasm 
(Karl and Avise 1992). This suggestion may also be consistent with the
long-standing observation that allozyme heterozygosities in mollusks are
strongly associated with presumed fitness components such as metabolic
efficiency and growth rate (Garton et al. 1984; Zouros et al. 1980; see also
Hare et al. 1996). Whether this explanation or its converse (that allozymes
faithfully register high gene flow in oysters, but mtDNA and some nDNA markers
differ between the Atlantic and Gulf coasts because of diversifying selection)
is correct, the conclusion is that natural selection probably has acted on
at least some of the genetic markers. This finding underlines the ever-present
need for caution in inferring population structure and gene flow under
an assumption of selective neutrality for all molecular markers.

Figure 6.4 Allele frequencies in oyster populations along a coastal transect from
Massachusetts through South Carolina, Georgia, Florida, and Louisiana. Shown are
frequencies of the most common alleles at (A) five polymorphic allozyme loci: Est1,
Lap1, 6Pgd, Pgi, and Pgm (data from Buroker 1983), and (B) five loci assayed at the
DNA level: mtDNA (heavy line; data from Reeb and Avise 1990) and four anonymous
single-copy nuclear genes. (After Karl and Avise 1992.)

PHYSICAL DISPERSAL BARRIERS. Bluegill sunfish (Lepomis macrochirus) are 
active swimmers, abundant throughout their freshwater range in North 
America. An allozyme survey of 2,560 specimens divided equally among 64 
localities (eight sites per reservoir, four reservoirs in each of two adjacent 
river drainages) revealed that about 90% of the total allele frequency vari- 
ance occurred between reservoirs in a drainage, whereas within reservoirs 
(which ranged in size up to more than 100,000 acres) allele frequencies sel- 
dom were significantly heterogeneous (Avise and Felley 1979). Clearly, the 
subdivided structure of the physical environment (reservoirs separated by 
dams) had imposed a corresponding genetic structure on these otherwise 
highly mobile fish. 

Gyllensten (1985) reviewed allozyme literature on geographic popula- 
tion structure within each of 19 fish species characterized by lifestyle: strict- 
ly freshwater, anadromous, and marine. The average percentages of total 
intraspecific gene diversity that were distributed among locales (as opposed 
to within them) increased dramatically in the following order by habitat: 
marine taxa (1.6%), anadromous species (3.7%), and freshwater species 
(29.4%). Thus, differences in spatial distributions of genetic variability gen- 
erally coincided with qualitative differences in the occurrence of obvious 
geographic barriers to movement. Few trends in population genetics are 
without exception, however, and molecular analyses of several marine and 
anadromous species have sometimes documented levels of range-wide 
population structure that are quite comparable to those typifying many 
freshwater fish species (e.g., Avise et al. 1987b; Bowen and Avise 1990). 

In a flightless water strider (Aquarius remigis) that migrates by rowing on 
water surfaces, an allozyme survey by Preziosi and Fairbairn (1992) revealed 
that whereas populations distributed along a given stream are nearly
undifferentiated (F_ST = 0.01), those inhabiting different streams in a watershed are 
highly structured (F_ST = 0.46). By contrast, a water strider species (Limnoporus 
canaliculatus) with functional wings exhibited nearly homogeneous allele fre- 
quencies throughout several Atlantic seaboard states (Zera 1981). An 
mtDNA survey has also been conducted on open-ocean water striders, or 
sea-skaters (Halobates spp.), one of the few insect groups to have invaded the 
marine environment. Although the data are not extensive, they suggest that 
population genetic structure in these species may be partitioned primarily on 
the spatial scale of large oceanic regions (Andersen et al. 2000). Collectively, 
these available results on aquatic Hemiptera suggest that inherent dispersal 
capacities, in conjunction with the physical nature of the environment, exert 
a huge influence on species’ population genetic structures. 
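
To give a rough sense of scale, the two F_ST values reported for Aquarius remigis can be passed through the island-model relation of Box 6.3 (equation 6.8). This is an illustrative calculation only: it assumes neutral markers and an equilibrium between gene flow and drift, assumptions that the original authors are not stated here to have made for these populations.

```latex
% Within a stream (F_ST = 0.01):
Nm = \frac{1 - F_{ST}}{4F_{ST}} = \frac{1 - 0.01}{4(0.01)} = \frac{0.99}{0.04} \approx 24.8

% Among streams in a watershed (F_ST = 0.46):
Nm = \frac{1 - F_{ST}}{4F_{ST}} = \frac{1 - 0.46}{4(0.46)} = \frac{0.54}{1.84} \approx 0.29
```

Under those assumptions, populations along a stream would exchange on the order of two dozen migrants per generation, whereas populations in different streams would exchange well under one.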

On the other hand, genetic comparisons of population structure in five 
species of carabid beetles revealed no correlation with degree of flight-wing 
development (ranging from vestigial to fully winged). A positive correlation 
was noted, however, between F_ST values and the elevations of the collecting 
sites (Liebherr 1988), suggesting in this case that habitat fragmentation (of 
highland sites) is more important than dispersal capability alone in molding 
population genetic structures in these beetles. 

Numerous other species likewise occupy discontinuous habitats and 
may show significant population genetic structures related to environmental 
patchiness, which often overrides their normal dispersal capabilities. For 
example, troglobitic (obligate cave-dwelling) crickets in the genera 
Hadenoecus and Euhadenoecus were shown to exhibit greater allozyme popu- 
lation structure in their isolated pockets of habitat (different cave systems) 
than did their epigean (surface-dwelling) counterparts (Caccone and 
Sbordoni 1987). Similarly, for mice on islands, even narrow ocean channels 
must be huge hurdles to dispersal, and small island populations of 
Peromyscus do indeed often show less within-population variability and 
greater between-island genetic differences than do their mainland counter- 
parts over comparable geographic scales (Ashley and Wills 1987, 1989; Avise 
et al. 1974; Selander et al. 1971). For any habitat specialist, suitable environ- 
ments may be scattered. To pick one more setting as a final example, granite 
outcrops are scattered across the southeastern United States like small 
islands in a matrix of mesophytic forest. They house several endemic species, 
such as the beetle Collops georgianus, whose populations proved to display far 
more pronounced genetic structure among outcrops (FST = 0.19) than within
them (FST = 0.01) (King 1987). Similar population genetic patterns have been
documented by molecular markers in a variety of plant species endemic to 
these isolated patches of rock (Wyatt et al. 1992; Wyatt 1997). 


PHILOPATRY TO NATAL SITE. Each reproductive season, female marine tur- 
tles typically migrate hundreds or thousands of kilometers from foraging 
grounds to nesting locales, where they deposit eggs on sandy beaches. For 
example, green turtles (Chelonia mydas) that nest on Ascension (a small,
isolated island on the mid-Atlantic oceanic ridge) otherwise inhabit feeding
pastures along the coast of Brazil, some 2,000 km distant. From repeated 
captures of physically tagged adults, it was long known that green turtles 
exhibit strong nest site fidelity; that is, Ascension females nest on Ascension 
and nowhere else, Costa Rican and Venezuelan nesters are faithful to their 
respective rookeries, and so on. What remained unknown was whether the 
site to which a female is fidelic as an adult was also her natal rookery. If 
female “natal homing” prevails, most rookeries should exhibit clear genet- 
ic differences from one another with regard to matrilines (and hence 
mtDNA), even if appreciable inter-rookery exchange of nuclear genes 
occurs via the mating system and male-mediated gene flow (Karl et al. 
1992). In the first genetic surveys of green turtle rookeries around the world 
(Bowen et al. 1992; Meylan et al. 1990), a fundamental split in mtDNA 
genealogy was found to distinguish all surveyed specimens in the 
Atlantic-Mediterranean from those in the Indian-Pacific, and pronounced 
genetic substructure also proved to characterize rookeries within each 
ocean basin (Figure 6.5). Indeed, distinct mtDNA haplotypes completely
(or nearly) distinguished many pairs of nesting colonies within an ocean 
basin, a finding indicative of a strong propensity for natal homing by nest- 
ing females. 

Following these pioneering studies, similar population genetic surveys 
have been conducted on most of the world's seven or eight marine turtle 
species (see reviews in Avise 2000a; Bowen and Avise 1996; Bowen and Karl 
1997). These surveys include further molecular analyses of green turtles 
(e.g., Encalada et al. 1996) as well as hawksbills (Eretmochelys imbricata; 
Broderick et al. 1994; Bass et al. 1996), loggerheads (Caretta caretta; Bowen et 
al. 1993b, 1994; Encalada et al. 1997), and ridleys (Lepidochelys species; 
Bowen et al. 1998). The empirical findings, often qualitatively paralleling 
those described above, exemplify how even some of the world's most high- 
ly mobile species nonetheless can display dramatic matrilineal population 
structures, due in this case to both geographic constraints (e.g., physical bar- 
riers between oceans) and inherent natal homing behaviors (to particular 
rookeries within oceanic basins). 

Whales too are impressive mariners, normally traveling many thou- 
sands of kilometers seasonally. Several analyses of mtDNA (and nDNA) 
from skin biopsies of humpback whales (Megaptera novaeangliae) sampled 
globally have found genetic differences between various groups, including 
those previously reported to show distinct migration routes within an ocean 
basin between summer feeding grounds in subpolar or temperate environs 
and winter breeding areas in the tropics (Baker et al. 1990, 1993, 1994, 1998; 
Larsen et al. 1996; Palsbøll et al. 1995, 1997b). Such spatial partitioning of
matrilineal genotypes appears due in large part to female-directed fidelity 
to specific migratory destinations. Several other cetacean species have simi- 
larly been shown to be subdivided into matrilineal groups through which 
cultural traditions are passed (Whitehead 1998). These results illustrate how 
social behaviors can be another factor promoting population genetic struc- 
ture in highly mobile marine animals. 

Salmon are active and powerful swimmers, but also are notorious for 
suspected natal homing propensity, in this case by both sexes. In anadro- 
mous forms of these species, juveniles spawned in freshwater streams 
migrate to the sea before returning to their natal stream years later as adults 
to complete the life cycle. Numerous surveys of nuclear genes (e.g., via 
allozymes, microsatellites) and mtDNA from both Atlantic and Pacific 
species have revealed significant genetic differences among spawning pop- 
ulations at various microspatial, mesospatial, and macrogeographic scales 
(some early studies were by Billington and Hebert 1991; Ferguson 1989; 
Gyllensten and Wilson 1987a; Ryman 1983; and Stahl 1987). Small or mod- 
est allele frequency shifts often characterize spawning populations within 
and among nearby drainages (e.g., Banks et al. 2000; Laikre et al. 2002; J. L. 
Nielsen et al. 1997; Scribner et al. 1998; G. M. Wilson et al. 1987), or even 

Figure 6.5 Phenogram summarizing relationships among 226 sampled nests of
the green turtle. To conserve space, sequence divergence (p) axes on the bottom are
presented as mirror images centered around the root leading to two distinct clonal
assemblages (Atlantic-Mediterranean versus Indian-Pacific ocean basins). (After
Bowen et al. 1992.)

al. 1995; Maynard Smith 1990; but see Bishop et al. 1985). Thus, most
molecular analyses of gender-biased dispersal have relied instead on mtDNA data
(indicative of matrilineal history) interpreted in conjunction with population 
genetic information from autosomal markers such as allozymes or 
microsatellites (indicative of biparental histories). 

A case in point involves macaque (Macaca) monkeys, in which mirror- 
image patterns of geographic variation have been reported in nuclear- 
encoded allozymes versus mtDNA. Male macaques typically leave their 
natal group before reaching sexual maturity, whereas females remain for 
life. Melnick and Hoelzer (1992) reviewed the literature on molecular varia- 
tion in several macaque species (M. fascicularis, M. mulatta, M. nemestrina, 
and M. sinica) and reported patterns of geographic population structure that 
are consistent with these gender-specific behaviors. For example, in the 
nuclear genome of M. mulatta, only 9% of total intraspecific diversity proved 
attributable to variation among geographic locales, whereas 91% of overall 
diversity in the mitochondrial genome occurred between populations. Thus, 
spatial genetic patterns registered by these two genomes are "intimately 
linked to the asymmetrical dispersal patterns of males and females and the 
maternal inheritance of mtDNA" (Melnick and Hoelzer 1992). Furthermore, 
as shown by subsequent DNA sequence analyses, bifurcations in macaque 
mtDNA gene trees typically predate Y chromosome divergences at the same 
phylogenetic nodes, as might be expected for these female-philopatric ani- 
mals (Tosi et al. 2003). 

A similar genetic pattern of extreme sex-biased dispersal has been 
reported in a communally breeding, nonmigratory bat (Myotis bechsteinii). 
Based on a comparison of mitochondrial and nuclear microsatellites, almost 
complete separation was uncovered in mtDNA markers due to near- 
absolute female philopatry, despite extensive male dispersal that had pro- 
duced only a weak (albeit statistically significant) population genetic struc- 
ture at nuclear loci (Kerth et al. 2002). 

Molecular studies of at least one avian species have reported exactly the 
opposite population genetic pattern, suggesting sex-biased dispersal in 
favor of females (rather than males). In red grouse (Lagopus lagopus) from 
northeastern Scotland, molecular analyses of 14 populations revealed sig- 
nificant spatial structure in nuclear microsatellite markers but not in 
mtDNA (Piertney et al. 2000). Although at first thought these findings might 
seem contradictory (because female-mediated gene flow would move 
nuclear as well as mitochondrial markers), the authors identified theoretical 
models under which these outcomes are plausible, provided that specifiable 
differences in the dispersal and ecology of males versus females are such 
that local effective population sizes of the nuclear genome (more so than the 
mitochondrial genome) are severely reduced (see Piertney et al. 2000). 

Another possible example of distinctive genetic signatures resulting 
from gender-based differences in behavior involves the green turtle 
(Chelonia mydas). As already mentioned, most rookeries within an ocean 
basin are strongly isolated with regard to mtDNA lineages (mean inferred 
Nm = 0.3), indicating a strong propensity for natal homing by females
(Bowen et al. 1992). However, these same rookeries proved to be somewhat 
less differentiated at assayed nuclear loci (mean Nm = 1.7; Karl et al. 1992), 
perhaps because of occasional male-mediated gene flow. Green turtles are 
known to mate at sea, often on feeding grounds or other locales spatially 
removed from the nesting sites. Thus, inter-rookery matings could provide 
an avenue for nuclear gene exchange that largely is closed to mtDNA 
because of female natal homing. Other marine organisms that have shown 
matrifocal genetic arrangements and contrasting population structures in 
cytoplasmic versus nuclear loci include not only various cetaceans (as 
already mentioned) but also some species of pinnipeds (sea lions and allies). 
The southern elephant seal (Mirounga leonina), for example, displays signif- 
icantly greater population structure in mtDNA markers than in nDNA 
markers across the southern oceans (Hoelzel et al. 2001; Slade et al. 1998). 

In interpreting such molecular contrasts, one potential complication is 
the theoretical fourfold lower effective population size for uniparentally than 
for biparentally transmitted genes. Thus, all else being equal, mtDNA is more 
subject to genetic drift effects, which also can promote the emergence of 
salient population structure. This factor can be taken into account in data 
analyses, as illustrated by Wilmer et al. (1999) when they documented high- 
er population subdivision in the Australian ghost bat (Macroderma gigas) for 
mtDNA than for nuclear markers even after factoring in the expected
difference in Ne for these two sets of genes. Nonetheless, whenever possible, it is
also desirable to gauge dispersal by the two sexes not only indirectly via 
assessment of population genetic structure, but also from more direct obser- 
vational evidence (as has been done for southern elephant seals; Fabiani et al. 
2003). As described next, comparisons between indirect (genetic) and direct 
contemporary appraisals of dispersal in the field are often far more informa- 
tive than either form of evidence interpreted in isolation. 
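
The fourfold adjustment can be made concrete with a small calculation. The following Python sketch is an illustration only, not the procedure used by Wilmer et al. (1999). It assumes an island model at equilibrium, an equal sex ratio, and equal dispersal by the two sexes, under which nuclear FST is approximately 1/(1 + 4Nm) and mtDNA FST is approximately 1/(1 + Nm). The function name is hypothetical.

```python
def expected_mtdna_fst(nuclear_fst):
    """FST expected for mtDNA if only the fourfold Ne difference matters.

    Assumes an island model at equilibrium, an equal sex ratio, and equal
    dispersal by males and females (all simplifying assumptions).
    """
    if not 0.0 < nuclear_fst < 1.0:
        raise ValueError("nuclear_fst must lie strictly between 0 and 1")
    # Nuclear: FST = 1 / (1 + 4Nm)  ->  Nm = (1 - FST) / (4 * FST)
    # mtDNA:   FST = 1 / (1 + Nm)   ->  4 * FST / (1 + 3 * FST)
    return 4.0 * nuclear_fst / (1.0 + 3.0 * nuclear_fst)


# Macaca mulatta values cited earlier in this section: 9% of nuclear
# diversity and 91% of mtDNA diversity lay among locales. They are treated
# here as rough FST analogues, which is itself an assumption.
expected = expected_mtdna_fst(0.09)
print(f"expected mtDNA FST = {expected:.2f}, observed = 0.91")
```

An observed mtDNA value far above the drift-only expectation is the kind of contrast that points toward female philopatry rather than reduced effective population size alone.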

Many waterfowl provide exceptions to the prevalent pattern of male- 
biased philopatry in birds. In the lesser snow goose (Chen caerulescens), as in 
some other migratory waterfowl, pair formation occurs on wintering 
grounds, where birds from different nesting areas often gather in mixed 
assemblages. Then a mated pair normally returns to the female's natal or 
prior nesting area. Among all avian species for which direct banding returns 
are available (Cooke et al. 1975), according to P. J. Greenwood (1980), "the
lesser snow goose is the best documented example of male biased natal and 
breeding dispersal." This natural history pattern suggests considerable inter- 
colony gene flow mediated by males, an expectation consistent with results 
of both allozyme (Cooke et al. 1988) and nRFLP studies (Quinn 1988; Quinn 
and White 1987). This behavior also suggests that colonies should be
isolated with regard to matriarchal lineages, but surprisingly, this has not proved
to be the case. In an mtDNA survey of 160 geese from colonies across the 
breeding range (from Russia to the eastern Canadian Arctic), no significant 
differences were observed in the spatial frequencies of two major mtDNA 
clades, a result indicative of considerable population connectivity and gene 


ee SSS 


276 Chapter 6 


flow involving females (Avise et al. 1992b; Quinn 1992). One likely explana- 
tion is that the entire current range of the snow goose was colonized recent- 
ly from expansion out of Pleistocene refugia, where separation between the 
two mtDNA clades may have been initiated. A related possibility is ongoing 
gene flow, either via occasional lapses in philopatry by females (a phenome- 
non that has been documented by direct banding returns) or via episodic 
pulses of mass movement of individuals during periods of colony perturba- 
tion (also suspected from field observations). Whatever the process, snow 
goose colonies must have been in recent matrilineal contact notwithstanding 
the propensity for natal philopatry by females. 

From these comparisons of banding and genetic data for snow geese, 
two important object lessons emerged: that direct behavioral or marking 
studies on contemporary populations can in some cases provide a mislead- 
ing picture of the geographic distributions of genetic traits because they fail 
to reveal the important evolutionary aspects of population connectivity 
revealed in genes; and conversely, that geographic distributions of genetic
markers can in some cases provide a misleading picture of contemporary 
dispersal and gene flow because they retain a record of evolutionary events 
and demographic parameters that may differ from those of the present. 
Thus, a full appreciation of geographic population structure in any species 
requires an integration of evolutionary (genetic) and contemporary (behav- 
ioral) perspectives. 

It is also true, however, that some waterfowl populations have shown 
striking matrilineal differentiation. In the spectacled eider (Somateria fischeri),
mtDNA markers revealed much higher regional population structure than 
did sex-linked and autosomal microsatellite loci (Scribner et al. 2001b). From 
these genetic data, the authors estimated that per generation rates of inter- 
regional gene flow were almost 35 times greater for males than for females 
(1.28 × 10^-2 and 3.67 × 10^-4, respectively). Male-biased dispersal and gene
flow have also been genetically deduced in some passeriform species, such 
as the yellow warbler (Dendroica petechia; Gibbs et al. 2000b) and red-billed
quelea (Quelea quelea; Dallimer et al. 2002). 

Invertebrates also have been the subject of critical molecular analyses of
gender-asymmetric dispersal. Africanized "killer" bees are aggressive forms 
of Apis mellifera that spread rapidly in the New World following the intro- 
duction of African honeybees into Brazil in the late 1950s. Two competing 
hypotheses were advanced for their mode of spread and the composition of 
their colonies. Perhaps queens are sedentary, such that most of the geo- 
graphic expansion in aggressive behavior has resulted from gene flow medi- 
ated by drones. Under this hypothesis, males might travel considerable dis- 
tances and mate with the docile honeybees of European ancestry that for- 
merly constituted domesticated hives in the Americas. Alternatively, per- 
haps gene flow has resulted from colony swarming, a mechanism of mater- 
nal migration wherein a queen and some of her workers leave a hive and fly 
elsewhere to establish a new colony. Under this hypothesis, hybridization 
with domesticated European bees is not required. 

Molecular analyses illuminated the issue by first demonstrating the
involvement of colony swarming: Surveyed colonies of Africanized bees in 
the Neotropics often proved to carry African-type (as opposed to European- 
type) mtDNA (Hall and Muralidharan 1989; Hall and Smith 1991; D. R. 
Smith et al. 1989). Furthermore, allozymes and other nDNA markers 
showed that African and European honeybees had hybridized in the 
Neotropics, at least occasionally, and that this hybridization led to intro- 
gression of nuclear genes as part of the Africanization process, albeit to an 
argued degree (Hall 1990; Lobo et al. 1989; Rinderer et al. 1991; Sheppard et 
al. 1991). More recently, an intensive molecular investigation into the 
Africanization process in Mexico's Yucatán Peninsula has been reported 
(Clarke et al. 2002). Bees of African ancestry first arrived there in 1986. Based 
on analyses of mitochondrial and nuclear microsatellite markers that distin- 
guish African from European forms, the genetic composition of Yucatán 
populations changed dramatically in the ensuing 15 years. By 1989, sub- 
stantial paternal gene flow from invading Africanized drones had occurred, 
but maternal gene flow was negligible. By 1998, however, a radical shift had 
occurred, such that African nuclear alleles (65%) and African-derived
mtDNA (61%) both predominated in the formerly European colonies. 

Dispersal can also be sex-biased in many plants, notably due to the fact 
that pollen tends to be far more dispersive than seeds. One net consequence 
in such species is a greater opportunity for the spread of nuclear alleles than 
of maternally transmitted cytoplasmic alleles. Using cpDNA markers, often 
in conjunction with those provided by nuclear DNA or allozyme loci, such 
possibilities have been investigated and sometimes (not invariably) docu- 
mented in several plant species (Grivet and Petit 2002; Latta and Mitton 
1997; McCauley et al. 1996; McCauley 1998; Oddou-Muratorio et al. 2001). 


Non-neutrality of some molecular markers 


Lewontin and Krakauer (1973) pointed out that one expected signature of 
natural selection on genetic markers is the appearance of significant hetero- 
geneity across loci in allele frequency variances among geographic popula- 
tions. In theory, genetic drift, gene flow, and the breeding structure of a 
species should affect all neutral autosomal loci in a similar fashion, so differ- 
ent population genetic patterns across loci might signify either that allele fre- 
quencies at geographically variable loci are under diversifying selection 
(despite high gene flow as evidenced by geographically uniform genes), or 
that allele frequencies at geographically uniform loci are under stabilizing or 
equilibrium selection (despite low gene flow as evidenced by heterogeneous 
allele frequencies at geographically variable loci). Lewontin and Krakauer 
applied this reasoning to suggest that natural selection had acted on at least 
some human blood group polymorphisms (Cavalli-Sforza 1966), which on a 
global scale showed allele frequency variances spanning a wide range (FST =
0.03 to FST = 0.38). The "Lewontin-Krakauer" test subsequently was
criticized on the grounds that its statistical methods seriously underestimated
variances in gene frequencies expected under the null (neutral) theory (Nei
and Maruyama 1975; Robertson 1975; see also Lewontin and Krakauer 1975). 
Nevertheless, it remains true that different loci within a species can some- 
times paint very different pictures of population structure and gene flow, and 
that some of these patterns can be strongly suggestive of various departures 
from selective neutrality. 
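
The logic of the comparison across loci can be sketched in a few lines of Python. This is a simplified illustration, not a reconstruction of the original analysis. It assumes the approximation usually attributed to Lewontin and Krakauer (1973), in which the neutral variance of FST across loci is about k(mean FST)^2/(n - 1) for n sampled populations, with k = 2. The function name and the input values are hypothetical, and, as noted above, critics argued that this null variance is too small under many realistic population histories.

```python
from statistics import mean, variance


def lk_variance_ratio(fst_values, n_populations, k=2.0):
    """Observed variance of FST across loci divided by the assumed
    neutral expectation k * mean(FST)**2 / (n_populations - 1)."""
    if len(fst_values) < 2 or n_populations < 2:
        raise ValueError("need at least two loci and two populations")
    f_bar = mean(fst_values)
    observed = variance(fst_values)  # sample variance across loci
    expected = k * f_bar ** 2 / (n_populations - 1)
    return observed / expected


# Hypothetical per-locus FST values spanning the range quoted in the text.
fst_by_locus = [0.03, 0.08, 0.12, 0.38]
ratio = lk_variance_ratio(fst_by_locus, n_populations=10)
print(f"observed / expected variance = {ratio:.1f}")
```

A ratio well above 1 would be read, under these assumptions, as more heterogeneity among loci than drift and gene flow alone predict. Because the expectation itself is contested, such a ratio is better treated as a flag for further study than as a formal test.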

A noteworthy early example involved the deer mouse (Peromyscus man- 
iculatus). In allozyme surveys of populations from across North America, 
FST values at six polymorphic loci ranged from 0.04 (inferred Nm = 6.0) to
0.38 (Nm = 0.4) (Avise et al. 1979c). Especially remarkable was the observa- 
tion that surveyed populations from central Mexico to northern Canada and 
from the Pacific coast to the Atlantic all exhibited roughly similar
frequencies (FST = 0.05) of the same two electromorphs at the aspartate
aminotransferase (Got-1 or Aat-1) locus. Subsequent screening by varied
electrophoretic techniques and other discriminatory assays failed to reveal any
appreciable "hidden protein variation" within these two Aat-1 electromorph classes
(Aquadro and Avise 1982b). Yet this relative geographic near-homogeneity
at Aat-1 contrasts sharply with the extreme geographic heterogeneity exhib- 
ited by this species in morphology, ecology, karyotype, and mtDNA 
sequence (Baker 1968; Blair 1950; Bowers et al. 1973; Lansman et al. 1983). 
For example, the number of acrocentric chromosomes ranges from 4 to 20 
across populations (Bowers et al. 1973), and regional populations often 
show deep historical subdivisions involving cumulative and fixed
differences in mtDNA (Lansman et al. 1983). It is difficult to escape the conclusion
that Aat-1 provides a serious underestimate of the overall magnitude of 
population genetic structure in this species. One theoretical possibility is 
that geographically uniform selection somehow balances Aat-1 allele
frequencies despite severe historical and contemporary restrictions on gene
flow apparently registered by numerous other genetic traits. 
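
The Nm values quoted in this example are consistent with the standard island-model relationship for nuclear loci, FST = 1/(1 + 4Nm). A minimal Python sketch of the conversion follows. The function name is hypothetical, and the calculation inherits all of the island model's equilibrium and neutrality assumptions.

```python
def nm_from_fst(fst):
    """Invert FST = 1 / (1 + 4Nm) to obtain Nm (island model, nuclear loci)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("fst must be greater than 0 and at most 1")
    return (1.0 / fst - 1.0) / 4.0


# FST values reported for deer mouse allozyme loci in the text above.
for fst in (0.04, 0.05, 0.38):
    print(f"FST = {fst:.2f} -> Nm = {nm_from_fst(fst):.1f}")
```

The conversion is only as reliable as the neutrality assumption behind it. If Aat-1 frequencies are held similar by selection, the high Nm implied by its low FST says little about actual gene flow.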

The converse of this situation may apply to a classically studied 
allozyme polymorphism in Drosophila melanogaster. The main biochemical 
function of alcohol dehydrogenase (ADH) is to metabolize ethanol, which is 
abundant in fermented fruits in the flies’ natural environment. Several
studies have shown that the Adh^F allele has significantly higher enzymatic
activity than Adh^S, but is less heat-resistant, and that these and other
biochemical and physiological attributes translate into fitness differences between
Adh genotypes under particular experimental regimes (Sampsell and Sims 
1982; van Delden 1982). In natural populations, frequencies of these two Adh 
alleles often vary locally (e.g., inside versus outside wine cellars; Hickey and 
McLean 1980) and also show strong latitudinal clines, with Adh^F more
common with increasing latitude in both the Northern and Southern
hemispheres (Oakeshott et al. 1982). Such evidence for diversifying selection on
Adh implies that prima facie estimates of gene flow based on this polymor- 
phism alone could be misleadingly low. Based on several other genetic 
traits, Singh and Rhomberg (1987) concluded that gene flow in D.
melanogaster is sufficiently high (Nm = 1-3), even on continental scales, to
theoretically homogenize nuclear genes in the absence of selection (but see Begun
and Aquadro 1993; Hale and Singh 1991). On the other hand, further molec- 
ular analyses of D. melanogaster and related species have uncovered a rich 
heterogeneity of population genetic signatures, suggesting that natural 
selection, genetic drift, mutation rate, recombination rate, and other evolu- 
tionary factors must have all contributed (often interactively) to the 
observed patterns (Aquadro 1992). 

Based on similar arguments from comparative geographic patterns, nat- 
ural selection on at least some molecular markers has been implicated for 
various genetic polymorphisms in many other species as well (e.g., Ayala et 
al. 1974). A recent example involving humans entailed calculating genetic 
differentiation at more than 330 short tandem repeat (STR) loci in Africans 
and Europeans (Kayser et al. 2003). For about a dozen loci that displayed 
unusually large genotypic differences, additional linked loci were then 
genotyped, and they too showed significant genetic divergences between 
these populations. The authors concluded that these loci displaying aber- 
rant genetic distances from the genomic norm probably earmark chromoso- 
mal regions that have been under unusually intense diversifying selection 
related to environmental circumstances. 

When such findings are coupled with further lines of evidence for bal- 
ancing, directional, or diversifying selection on particular proteins (see 
Chapter 2) or DNA sequences (e.g., Hughes and Nei 1988; Kreitman 1991; 
MacRae and Anderson 1988; Nei and Hughes 1991), it becomes clear that 
interpretations of geographic population structure under the assumption of 
strict neutrality are made with some peril. At the very least, conclusions 
about genomically pervasive forces shaping population structure in any 
species should be based on information from multiple independent loci. 


Historical demographic events 


For reasons of mathematical tractability, many theoretical models in popu- 
lation genetics yield only equilibrium expectations between counteracting 
evolutionary forces, such as the diversifying influence of genetic drift in 
small populations versus the homogenizing influence of gene flow under an 
island model or a stepping-stone model (see Box 6.3). Seldom is it feasible to 
formally consider the idiosyncratic histories of particular species or to treat 
non-equilibrium situations. Yet demographic histories and phylogenies of 
real species are highly idiosyncratic and are likely to produce population 
genetic signatures that depart in various ways from theoretical equilibrium 
expectations. Hence, empirical genetic structures of natural populations are 
notoriously challenging to interpret. 

For example, in comparative analyses of three anadromous fish species 
along the same coastline transect in the southeastern United States, Bowen 
and Avise (1990) were led to consider several historical demographic and 
biogeographic factors that might have produced the observed differences in 
population structure as registered by mtDNA. All three species showed 
significant differences in haplotype frequency between the Atlantic and the
Gulf of Mexico, but in magnitudes and patterns that differed greatly among 
taxa. Populations of the black sea bass (Centropristis striata) showed little with- 
in-region polymorphism and a clear phylogenetic distinction between the 
Atlantic and the Gulf; menhaden (Brevoortia tyrannus and B. patronus) showed 
extensive within-region polymorphism and a paraphyletic relationship of 
Atlantic to Gulf populations; and sturgeon (Acipenser oxyrhynchus) exhibited 
extremely low mtDNA variation within and between regions. Based on the 
magnitude of mtDNA variation observed in regional populations of these 
three species, estimates of evolutionary effective population size varied by 
more than four orders of magnitude, from NF(e) = 50 (Gulf of Mexico
sturgeon) to NF(e) = 800,000 (Atlantic menhaden), and their rank order was
correlated with present-day census sizes. These differences in NF(e), which
presumably reflect the idiosyncratic demographic histories of the three species,
may help to explain some of their distinctive phylogenetic features, including 
the clean distinction between Atlantic and Gulf forms of the sea bass versus 
the paraphyletic pattern in menhaden (assuming that regional populations in 
both of these species were separated by similar historical vicariant events). 
However, even grossly different effective population sizes in the biogeo- 
graphic context of shared vicariance cannot explain all the contrasting fea- 
tures of population genetic structure in these three co-distributed fish species. 
Thus, for menhaden and sturgeon (but not sea bass), recent gene flow 
between the Atlantic and Gulf is strongly implicated by the shared presence 
in these two regions of several nearly identical mtDNA haplotypes. 
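
One common way to obtain an evolutionary effective population size from mtDNA is to convert mean pairwise sequence divergence among haplotypes into a coalescence time with a molecular clock, and then to equate that time, in generations, with the effective number of females. This is offered as an assumption, because the text above does not spell out the calculation. The Python sketch below uses hypothetical inputs throughout: the divergence value, the clock rate of 2% sequence divergence per million years, and the generation length are placeholders, not values from Bowen and Avise (1990).

```python
def effective_female_size(mean_divergence, generation_years,
                          divergence_per_myr=0.02):
    """Rough evolutionary effective number of females from mtDNA data.

    mean_divergence: mean pairwise sequence divergence (a proportion).
    divergence_per_myr: assumed clock rate (proportion per million years).
    Under neutrality, two random haplotypes are expected to coalesce about
    N generations ago, so N is approximated by (years to coalescence)
    divided by the generation length.
    """
    if generation_years <= 0 or divergence_per_myr <= 0:
        raise ValueError("generation length and clock rate must be positive")
    years_to_coalescence = mean_divergence / divergence_per_myr * 1_000_000
    return years_to_coalescence / generation_years


# Hypothetical species: 0.1% mean divergence and a 5-year generation length.
print(round(effective_female_size(0.001, 5.0)))
```

Because the estimate scales directly with the assumed clock rate and generation length, comparisons among species are usually more robust in rank order than in absolute value.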

Whether these particular inferences are correct or not, they serve to 
introduce some of the historical demographic considerations and non-equi- 
librium environmental conditions that must have affected genetic structures 
in real populations. In interpreting empirical data on population structure, 
deciding how far to pursue idiosyncratic demographic explanations is a dif- 
ficult challenge, particularly because these explanations can seldom be test- 
ed critically in controlled or replicated settings (however, see Fos et al. 1990; 
Scribner and Avise 1994a; Wade and McCauley 1984), and because compet- 
ing scenarios might also be compatible with the data. Nonetheless, cog- 
nizance of the limitations of equilibrium theory, and of the potential effect of 
historical demographic factors on population genetic structures, represents 
an important step toward greater realism. 


Population assignments 


Most of the molecular assessments of geographic structure and gene flow 
described above employed sample allele frequencies from composite 
assemblages of individuals ("populations") that had been defined a
priori, typically by subjective spatial and phenotypic criteria. Any such
population, real or not, will of course have some quantifiable genetic
relationship to others, but this summary characterization may obscure much that
is of biological interest. In other words, an undesirable element of circular 
reasoning is introduced into traditional assessments of spatial structure 
that take particular populations as "givens" at the outset of the analysis. 
Although this may seldom be a fatal difficulty in practice, it is desirable in 
many situations to treat individual organisms (whose genetic reality or 
coherence is seldom in dispute) as basic units of genealogical analysis. 

An early example of this approach involved summarizing genotypic 
data from 30 microsatellite loci into an evolutionary phenogram whose 
external nodes were the 148 individual humans examined (Figure 6.6). 
Despite the small variation in allele frequencies between regionally defined 
populations around the world, branches connecting individuals into a 
neighbor-joining tree (based on percentages of alleles shared across loci) 
proved to reflect these people's geographic origins "with remarkable accu- 
racy" (Bowcock et al. 1994). Other informative examples of this sort, in 
which individual organisms were treated as fundamental units in popula- 
tion structure analysis (based on genotypic data from multiple nuclear loci), 
include molecular analyses of Apis honeybees (Estoup et al. 1995), 
Heterocephalus mole-rats (O'Riain et al. 1996), and Odocoileus deer (Blanchong
et al. 2002). 
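
A distance based on shared alleles is simple to compute. The Python sketch below is one plausible formulation, not necessarily the exact measure used by Bowcock et al. (1994). For each locus it counts the alleles that two diploid individuals have in common, averages the shared proportion across loci, and subtracts the result from 1. Function names, sample labels, and genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_sharing_distance(ind_a, ind_b):
    """1 minus the mean proportion of alleles shared across loci.

    Each individual is a list of (allele1, allele2) tuples, one per locus,
    with loci listed in the same order for both individuals.
    """
    if not ind_a or len(ind_a) != len(ind_b):
        raise ValueError("individuals must be typed at the same loci")
    shared = 0.0
    for locus_a, locus_b in zip(ind_a, ind_b):
        # Multiset intersection counts each matching allele copy once.
        common = Counter(locus_a) & Counter(locus_b)
        shared += sum(common.values()) / 2.0
    return 1.0 - shared / len(ind_a)


# Hypothetical microsatellite genotypes (allele sizes) at two loci.
people = {
    "ind1": [(120, 124), (200, 200)],
    "ind2": [(120, 120), (200, 204)],
    "ind3": [(130, 132), (210, 212)],
}
for a, b in combinations(people, 2):
    print(a, b, allele_sharing_distance(people[a], people[b]))
```

The resulting pairwise matrix is the kind of input that a neighbor-joining routine would take to build an individual-based tree.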

A general goal in such studies is to classify particular individuals into 
populations (Davies et al. 1999; Guinand et al. 2002). One appropriate bio- 
logical context is when a number of suspected source populations may 
have contributed individuals to a sample of interest, in which case a mixed- 
stock analysis can be conducted (Waser and Strobeck 1998). Allele frequen- 
cies in candidate source populations are estimated at a series of unlinked 
loci, and the statistical likelihood (based on multi-locus genotype) that each 
individual of unknown origin came from each potential source is calculat- 
ed (Letcher and King 1999; Rannala and Mountain 1997; Smouse et al. 
1990). In a recent modification of this approach, Pritchard et al. (2000) intro- 
duced a Bayesian clustering method that attempts to assign multi-locus 
genotypes of individuals to specific populations while simultaneously esti- 
mating population allele frequencies. This method can be applied even in 
situations in which the source populations are not explicitly specified at the 
outset. The authors successfully applied this approach to humans and to an 
endangered avian species, Turdus helleri. Similar kinds of applications in 
individual-based population assignment also arise routinely in the context 
of human and wildlife molecular forensics (Campbell et al. 2003; Foreman 
et al. 1997; Paetkau et al. 1995; Roeder et al. 1998; see Chapter 9). 
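
The likelihood calculation described above can be sketched as follows. This is a simplified frequency-based assignment test that assumes Hardy-Weinberg proportions within each source and independence among loci; the population names, allele frequencies, and the small frequency floor used for unobserved alleles are hypothetical choices made only for illustration, and the sketch is not the Bayesian procedure of Rannala and Mountain (1997) or Pritchard et al. (2000).

```python
import math


def genotype_log_likelihood(genotype, freqs, floor=1e-3):
    """Log-likelihood of a multi-locus diploid genotype in one source population.

    Assumes Hardy-Weinberg proportions and unlinked loci. `floor` is an
    arbitrary stand-in frequency for alleles not observed in the source.
    """
    loglik = 0.0
    for locus, (a, b) in genotype.items():
        p = max(freqs[locus].get(a, 0.0), floor)
        q = max(freqs[locus].get(b, 0.0), floor)
        prob = p * p if a == b else 2.0 * p * q
        loglik += math.log(prob)
    return loglik


def assign(genotype, sources):
    """Return the most likely source and the log-likelihood for every source."""
    scores = {name: genotype_log_likelihood(genotype, f)
              for name, f in sources.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical allele frequencies at two unlinked loci in two candidate sources.
sources = {
    "north": {"L1": {"A": 0.8, "B": 0.2}, "L2": {"C": 0.7, "D": 0.3}},
    "south": {"L1": {"A": 0.1, "B": 0.9}, "L2": {"C": 0.2, "D": 0.8}},
}
unknown = {"L1": ("A", "A"), "L2": ("C", "D")}
best, scores = assign(unknown, sources)
print(best, {k: round(math.exp(v), 4) for k, v in scores.items()})
```

Working in log-likelihoods avoids numerical underflow when many loci are multiplied together; the ratio of likelihoods between sources indicates how decisive an assignment is.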

As described in the next section, one context in which individuals are 
routinely treated as fundamental units of population genetic analysis is in 
the field of mtDNA-based phylogeography. In this special case, historical 
relationships registered in individuals' mtDNA sequences reflect the matri- 
lineal component of population structure. 

Figure 6.6 Neighbor-joining tree for 148 people. The tree was constructed from
pairwise genetic distances at 30 microsatellite loci. The 148 subjects were treated in
this analysis as individuals. Note the generally good agreement of genetic clusters
with geographic origins. (After Bowcock et al. 1994.)

Phylogeography


Phylogeography is a field of study concerned with principles and processes 
governing the geographic distributions of genealogical lineages, especially 
those within and among closely related species. In other words, the disci- 
pline focuses explicitly on historical or phylogenetic components of popula- 
tion structure (including how these may have been influenced by genetic 
drift, gene flow, natural selection, or any other evolutionary forces). In broad 
terms, phylogeography's most important contributions to biology have 
been to emphasize non-equilibrium aspects of population structure and 
microevolution, clarify the tight connections that inevitably exist between 
population demography and historical genealogy (Box 6.5), and build con- 
ceptual and empirical bridges between the formerly separate fields of tradi- 
tional population genetics and phylogenetic biology (Figure 6.7). The field 
of phylogeography was reviewed in a recent textbook (Avise 2000a) that 
included approximately 1,500 references to the literature and that in many 
respects represents a companion volume to this current edition of Molecular 
Markers. Thus, only an introductory qualitative treatment of phylogeogra- 
phy, involving a few select examples, will be presented here. 





Figure 6.7 Phylogeography serves as a bridging discipline between several tradi- 
tionally separate fields of study in the micro- and macroevolutionary sciences. 
(After Avise 2000a.) 
BOX 6.5 Branching Processes and Coalescent Theory





Branching process theory and coalescent theory are formal mathematical disci- 
plines that address inherent connections between population demography and 
genealogy at the intraspecific level (Griffiths and Tavaré 1997; Herbots 1997; Taib 
1997). As such, they provide conceptual frameworks for interpreting many of the 
findings of empirical phylogeography. 

The notion that genealogy and demography are intimately related can be
introduced by the following scenario. Imagine that females in a population
produce daughters according to a Poisson distribution with a mean (and hence
variance) equal to 1.0. Under this model, the expectation that a female contributes
zero daughters to the next generation (or, the expected frequency of daughterless
mothers) is e^{-1} = 0.368 (e is the base of the natural logarithms), and her
probabilities of producing n offspring (n ≥ 1) are given by e^{-1}(1/n!). Thus, the chances
that a female contributes 1, 2, 3, 4, and 5 or more daughters are 0.368, 0.184,
0.061, 0.015, and 0.004, respectively. These probabilities apply across a single
organismal generation. Mathematical "generating functions" (available for several
theoretical distributions of family size, including the Poisson) can then be
employed to recursively calculate the probability that a matriline goes extinct
across multiple non-overlapping generations. Application of the Poisson generating
function, for example, yields cumulative probabilities of matriline extinction
that increase asymptotically from 0.368 in the first generation to 0.981 by
generation 100. In other words, due to the turnover of lineages that inevitably
accompanies reproduction, a few fortunate matrilines may proliferate at the expense of
many others that die out along the way.
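
The numbers in this scenario can be reproduced with a short calculation. The sketch below assumes a single founding female and the Poisson family-size model with mean 1.0 described above; the function names are hypothetical. The recursion applies the Poisson probability generating function, f(s) = exp(m(s - 1)), so that the extinction probability by generation t + 1 equals f evaluated at the extinction probability by generation t.

```python
import math


def poisson_pmf(n, mean=1.0):
    """Probability that a female leaves exactly n daughters."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def extinction_by_generation(generations, mean=1.0):
    """Cumulative probability that a matriline founded by one female
    has gone extinct by each generation (Poisson family sizes)."""
    q = 0.0
    history = []
    for _ in range(generations):
        q = math.exp(mean * (q - 1.0))  # generating function evaluated at q
        history.append(q)
    return history


probs = [poisson_pmf(n) for n in range(5)]   # 0, 1, 2, 3, 4 daughters
p_five_or_more = 1.0 - sum(probs)
ext = extinction_by_generation(100)

print([round(p, 3) for p in probs], round(p_five_or_more, 3))
print(round(ext[0], 3), round(ext[99], 3))
```

Because the mean family size is exactly 1.0, extinction is certain in the long run but is approached only slowly, which is why a small fraction of matrilines is still expected to persist after 100 generations.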

Thus, with respect to matrilineal genealogy, individuals in any extant
population invariably trace back, or "coalesce," to common ancestors at various
depths of time in the past. In other words, the individuals alive at any moment
are historically connected to one another in a hierarchically branched intraspecif- 
ic genealogy, as in this hypothetical illustration: 


[Illustration not reproduced: a hypothetical branching matrilineal genealogy plotted against organismal generations.]








Indeed, the only situation in which this would not be true is if each female 
in every generation replaced herself with exactly one daughter, in which case 
there would be no lineage sorting, no hierarchical branching structure (all 
matrilines would exist as a series of parallel lines of descent through time), 
and no coalescent. Of course, families in all real populations show variances 
in contributions to the progeny pool; the larger that variance, the more rapid 
the pace of lineage sorting and the more shallow the resulting coalescent 
point (all else being equal). In a roughly stable population with N_F females
and a Poisson distribution of family sizes, the expected mean time (in generations,
G) to common matrilineal ancestry for random pairs of individuals is
G = N_F, and the expectation for the coalescent point of the entire suite of lineages
is G ≈ 2N_F (Nei 1987).
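
These two expectations can be checked numerically. The sketch below assumes the standard neutral coalescent approximation for a haploid, maternally inherited marker, in which each pair of lineages coalesces at a rate of 1/N_F per generation, so that the expected time during which k lineages persist is N_F divided by the number of possible pairs. The function name and the population size used are hypothetical.

```python
from math import comb


def expected_tmrca(n_lineages, n_females):
    """Expected generations back to the common matrilineal ancestor of a sample,
    summing the expected waiting times while k = n, n-1, ..., 2 lineages remain."""
    return sum(n_females / comb(k, 2) for k in range(2, n_lineages + 1))


n_f = 1000  # hypothetical number of females
print(expected_tmrca(2, n_f))     # a random pair of individuals
print(expected_tmrca(100, n_f))   # a sample of 100 individuals
```

The sum has the closed form 2N_F(1 - 1/n), which equals N_F for a pair and approaches 2N_F as the sample comes to include the entire suite of lineages.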

The same logic of branching processes and coalescence applies to patri-
lines (the lineages traversed, for example, by the Y chromosome of mam-
mals). It also applies in principle to "gene genealogies" at any autosomal
locus, except that coalescent depths under neutrality are expected to be about 
fourfold greater (a twofold effect for biparental as opposed to uniparental 
inheritance, and another twofold effect for diploid versus haploid inheri- 
tance). Although it is beyond the scope of the current discussion, coalescent 
theory and related approaches have also been extended to populations that 
are historically dynamic in size (Harvey and Steers 1999), receive outside 
gene flow (e.g., Beerli and Felsenstein 1999, 2001; Nee et al. 1996a; Rogers 
and Harpending 1992; Wakeley and Hey 1997), and are geographically struc- 
tured into metapopulations (Bahlo and Griffiths 2000; Hey and Machado 
2003; Hudson 1998; Nei and Takahata 1993; Pannell 2003; Wakeley and
Aliacar 2001). In general, this new theoretical framework has promoted
recognition of the close relationships between genealogy and population 
demography that are highly germane to interpreting intraspecific gene trees 
estimated by molecular markers, most notably from mtDNA. 


History and background 


The introduction of mtDNA analyses to population genetics in the late 
1970s prompted a revolutionary shift toward historical, genealogical per- 
spectives on intraspecific population structure. Because mtDNA sequences 
evolve rapidly and show non-recombinational inheritance, they typically 
provide haplotype data that can be ordered phylogenetically within a 
species, yielding an intraspecific phylogeny (gene genealogy) interpretable 
as the matriarchal component of an organismal pedigree. Mitochondrial 
transmission in animal species constitutes the female analogue of male sur- 
name transmission in many human societies (Avise 1989b): Both sons and 
daughters inherit their mother's mtDNA genotype, which only daughters 
normally transmit to the next generation. Thus, mtDNA lineages reflect 
mutationally interrelated "female family names" of a species, and their
historical dynamics can be interpreted according to the types of theoretical
models long used to analyze surname distributions in
human societies (Lasker 1985; Lotka 1931; Box 6.5). A thumbnail sketch of
the history of phylogeography is presented in Box 6.6.
BOX 6.6 Brief Chronology of Some Key Developments
in the History of Phylogeographic Analysis


1974 Brown and Vinograd demonstrate how to generate restriction site maps
for animal mtDNAs.

1975 Watterson describes basic properties of gene genealogies, marking the
beginnings of modern coalescent theory.

Brown and Wright introduce mtDNA analysis to the study of the origins
and evolution of parthenogenetic taxa.

1977 Upholt develops the first statistical method to estimate mtDNA
sequence divergence from restriction digest data.

1979 Brown, George, and Wilson document rapid mtDNA evolution.

Avise, Lansman, and colleagues present the first substantive reports of
mtDNA phylogeographic variation in nature.

1980 Brown provides an initial report on human mtDNA variation.

1983 Tajima and also Hudson initiate statistical treatments of the distinction
between a gene tree and a population tree.

1986 Bermingham and Avise initiate comparative phylogeographic
appraisals of mtDNA for multiple co-distributed species.

1987 Avise and colleagues coin the word "phylogeography," define the field,
and introduce several phylogeographic hypotheses.

Cann and colleagues describe global variation in human mtDNA.

1989 Slatkin and Maddison introduce a method for estimating inter-population
gene flow from allelic phylogenies.

1990 Avise and Ball introduce principles of genealogical concordance as a
component of phylogeographic assessment.

1992 Avise summarizes the first extensive compilation of phylogeographic
patterns for a regional fauna.

1996 Edited volumes by Avise and Hamrick and by Smith and Wayne summarize
conservation roles for phylogeographic data.

1998 A special issue of the journal Molecular Ecology is devoted to
phylogeography.

Templeton reviews statistical roles of "nested clade analysis" in
phylogeography (Templeton 1993, 1994, 1996; for a critical appraisal, see
Knowles and Maddison 2002).

2000 The first textbook on phylogeography is published, by Avise.

2001 Molecular Ecology introduces a continuing subsection entitled
"Phylogeography, Speciation, and Hybridization."

Source: Avise 1998b.

One of the earliest phylogeographic studies based on mtDNA data still
serves as a useful illustration of the types of population structure that are 
frequently revealed (Avise et al. 1979b). The southeastern pocket gopher 
(Geomys pinetis) is a fossorial rodent that inhabits a three-state area in the 
southern United States. Analysis of 87 individuals from across this range by 
six restriction enzymes revealed 23 different mtDNA haplotypes, whose 
phylogenetic relationships and distributions are summarized in Figure 6.8. 
Clearly, most mtDNA genotypes in these gophers were localized geograph- 
ically, appearing at only one or a few adjacent collection sites. Furthermore, 
genetically related clones tended to be geographically contiguous or over- 
lapping, and a major gap in the matriarchal phylogeny cleanly distin- 
guished all eastern from all western populations. This principal phylogeo- 
graphic gap was also registered in the nuclear genome by shifts in frequencies
of distinctive karyotypes and protein electrophoretic alleles.











Figure 6.8 Mitochondrial DNA phylogeny for 87 pocket gophers across the
species range in Alabama, Georgia, and Florida. Lowercase letters represent differ- 
ent mtDNA genotypes, which are connected by branches into a parsimony net- 
work that is superimposed over the geographic sources of the collections. Slashes 
across network branches reflect the number of inferred mutational steps along a 
pathway. Heavier lines encompass two distinct mtDNA clades that differ by at 
least nine mutational steps. (After Avise et al. 1979b.) 
Population subdivision characterized by localized genealogical structure
or significant mtDNA phylogenetic gaps across a species’ range soon
were likewise reported in a huge number of animal species: mammals rang- 
ing from voles and mice to whales (early studies by Carr et al. 1986; Cronin 
et al. 1991a; Cronin 1992; MacNeil and Strobeck 1987; Plante et al. 1989; 
Prinsloo and Robinson 1992; Riddle and Honeycutt 1990; Wada et al. 1991), 
birds ranging from sparrows to geese (Avise and Nelson 1989; Shields and 
Wilson 1987b; Van Wagner and Baker 1990; Zink 1991), reptiles ranging from 
geckos to tortoises (Densmore et al. 1989a; Lamb and Avise 1992; Lamb et al. 
1989; Moritz 1991), amphibians (e.g., Wallis and Arntzen 1989), freshwater 
and marine fishes (Avise 1987; Bermingham and Avise 1986; Crosetti et al. 
1993), insects (Hale and Singh 1987, 1991; Harrison et al. 1987), crustaceans 
(Saunders et al. 1986), echinoderms (Arndt and Smith 1998a; Williams and 
Benzie 1998), mollusks (Murray et al. 1991; O'Foighil and Smith 1996; 
Quesada et al. 1995, 1998), and many others. A few species proved to exhib- 
it little or no mtDNA phylogeographic structure across broad ranges, but 
these were the exception rather than the rule. Examples included some 
large, mobile mammals (Lehman and Wayne 1991; Lehman et al. 1991), 
some birds (Ball et al. 1988; Tegelström 1987a), several marine fishes
(Arnason et al. 1992; Avise 1987), some migratory insects (Brower and Boyce 
1991), and miscellaneous other species, such as a nematode (Ostertagia
ostertagi) that parasitizes cattle and probably was spread widely by livestock
transport (Blouin et al. 1992). It soon became apparent for a wide array of 
species that differences in organismal vagility and environmental fragmen- 
tation (past and present) had exerted major influences on patterns of 
mtDNA phylogeographic population structure. 

One common finding is that regional assemblages of conspecific popu- 
lations often are distinguished by deep genealogical separations compared 
with the shallow separations in mtDNA genealogy normally observed with- 
in each assemblage. Such highly distinctive matrilineal clades within a 
species are sometimes provisionally referred to as "evolutionarily significant
units" (ESUs; see Chapter 9) or as salient "intraspecific phylogroups"
(Avise and Walker 1999). Furthermore, most species display only a small 
number of such ESUs (typically about 1-8), and they are usually spatially 
oriented in ways that make considerable sense in terms of geographic his- 
tory (such as known or suspected Pleistocene refugia and subsequent dis- 
persal routes) or taxonomy (e.g., they may agree well with traditionally 
described subspecies). Several examples are provided below, and many
more are summarized by Avise (2000a).

Presumably, the localization of closely related mitochondrial genotypes 
and clades in most species reflects contemporary restraints on gene flow (at 
least via females), and many of the deeper genetic breaks (distinguishing 
provisional ESUs) register much longer-term historical population separa- 
tions. Such observations quickly prompted a deeper appreciation of distinc- 
tions between contemporary gene flow and historical population connectivity
in a genealogical sense (Avise 1989a; Larson et al. 1984; Slatkin 1987).
What follows are a few illustrations (chosen to make particular points) of 
how mtDNA analyses have added a phylogenetic dimension to perspectives 
on intraspecific population structure. 


Case studies on particular populations or species 


GREEN TURTLES ON ASCENSION ISLAND. Ascension Island, a tiny (8-km 
diameter) island situated on the mid-Atlantic ridge halfway between Brazil 
and Liberia, is a major rookery for green turtles (Chelonia mydas). From direct 
tagging studies, it was known that females that nest on Ascension otherwise 
inhabit shallow-water feeding pastures along the South American coastline. 
Thus, for each nesting episode (every 2 to 3 years for an individual), females 
embark on a 5,000-km migration to Ascension Island and back, a several- 
months-long odyssey requiring navigational feats and endurance that near- 
ly defy human comprehension. How might Ascension turtles have estab- 
lished such an unlikely migratory circuit, particularly since suitable nesting 
beaches along the South American coast are utilized by other green turtles? 
Carr and Coleman (1974) proposed a historical biogeographic scenario 
involving plate tectonics and natal homing. Under their hypothesis, the 
ancestors of Ascension Island green turtles nested on islands adjacent to 
South America in the late Cretaceous, soon after the equatorial Atlantic 
Ocean opened. Over the past 70 million years, these volcanic islands have 
been displaced from South America by seafloor spreading (at a rate of about 
2 cm per year). A population-specific instinct to migrate to present-day 
Ascension Island thus might have evolved over tens of millions of years of 
genetic isolation (at least with regard to matrilines) from other green turtle 
rookeries in the Atlantic. 

Bowen et al. (1989) critically tested the Carr-Coleman hypothesis by 
comparing mtDNA genotypes of Ascension Island nesters with those of 
other green turtles. They identified fixed or nearly fixed mtDNA differences 
between Ascension and many Atlantic rookeries, a finding consistent with 
severe restrictions on contemporary inter-rookery gene flow by females and, 
thus, supportive of the natal homing aspects of the Carr-Coleman hypothe- 
sis. However, the magnitude of mtDNA sequence divergence from several 
other Atlantic rookeries was tiny (p < 0.002; see Figure 6.5), indicating that 
any current genetic separation of the Ascension colony was initiated very 
recently, probably within the last 100,000 years at most. Indeed, the time 
elapsed may have been much less than this, because the predominant 
Ascension haplotype proved indistinguishable in available assays from a 
genotype characterizing a Brazilian rookery (Bowen et al. 1992). In any 
event, these genetic results clearly were incompatible with the temporal 
aspects of the Carr-Coleman scenario. Instead, the colonization of 
Ascension by green turtles, or at least extensive matrilineal gene flow into 
the population, was evolutionarily recent.

SMALL MAMMALS OF AMAZONIA. To explain why the Amazon basin contains
the world's richest biota, several hypotheses have been advanced: the
refugial model, which posits that populations were sundered when their 
habitats were disjoined during cyclic expansions and contractions of forests 
during alternating wet and dry episodes of the Pleistocene (Cracraft and 
Prum 1988; Haffer 1969); ecological models, which posit that diversification 
was driven by selection pressures associated with high ecological and envi- 
ronmental heterogeneity in the region (Tuomisto et al. 1995); and the river- 
ine barrier model, which suggests that large rivers promoted genetic diver- 
gence in terrestrial organisms by blocking inter-regional gene flow (Ayres 
and Clutton-Brock 1992). 

The riverine barrier hypothesis was put to phylogeographic test in 
mtDNA analyses of more than a dozen small mammal species across large 
portions of Amazonia (da Silva and Patton 1998; Patton et al. 1994a,b, 1997; 
Peres et al. 1996). Salient phylogeographic partitions were uncovered with- 
in several species, but these genetic breaks did not correspond with the cur- 
rent positions of major rivers. Instead, highly divergent clades typically 
were observed in upstream versus downstream regions, in positions gener- 
ally demarcated by geological arches associated with Andean uplifts of the 
mid to late Tertiary. For at least some taxonomic groups, these observations 
prompted a new hypothesis for Amazonian phylogeography: that these 
ancient, quasi-isolated paleobasins may have been historical centers of 
diversification (da Silva and Patton 1998). 

Lessa et al. (2003) tested a prediction of the refugial model: that organ- 
isms originally isolated in Pleistocene refugia should have experienced 
substantial population growth when climates ameliorated and new habi- 
tats opened. Using coalescent theory (see Box 6.5) as applied to mtDNA 
sequence data for several small mammal species in western Amazonia, 
these authors uncovered no evidence for demographic expansions follow- 
ing the late Pleistocene. By contrast, pronounced and oft-concordant genet- 
ic footprints of recent population expansions were found in similar mtDNA 
analyses of several mammals occupying high latitudes in North America 
(Lessa et al. 2003). These results illustrate how historical demographic 
responses to climatic changes can be genetically tracked, and they also sug- 
gest that such responses may have varied across latitudinal gradients of 
biodiversity. 


BROWN BEARS AND ALLIES. Several genetic surveys of mtDNA in brown 
bears (Ursus arctos) have collectively spanned most of this species’ Holarctic 
range (Cronin et al. 1991b; Leonard et al. 2000; Matsuhashi et al. 2001; 
Paetkau et al. 1998; Taberlet and Bouvet 1994; Talbot and Shields 1996a,b; 
Waits et al. 1998). Results indicate the presence of about 5-6 provisional 
ESUs or phylogroups, each confined to a distinct region in North America, 
Europe, or Asia (Figure 6.9). Most likely, the species was subdivided histor- 
ically into several regional assemblages whose matrilines gradually accu- 
mulated the evident mtDNA differences. 

Figure 6.9 Global mtDNA phylogeography in the brown bear. This depiction
uses the American black bear as the outgroup, and also shows the matrilineal posi- 
tion of the polar bear. It is a simplified summary compiled from several references 
cited in the text. (After Avise 2004d.) 


Another interesting feature of these data is the position of the polar bear 
(Ursus maritimus) within this phylogeny (see Figure 6.9). In terms of matriar- 
chal ancestry as registered by mtDNA, polar bears appear to be closely allied 
to some brown bears in the "ABC Islands" of southeastern Alaska, thus mak-
ing this clade a tiny subset of the broader lineage diversity within brown 
bears globally (Shields et al. 2000). In other words, brown bear matrilines 
appear to be paraphyletic with respect to those of polar bears. One possibil- 
ity for this unexpected pattern is that introgressive hybridization recently 
transferred some mtDNA lineages from brown bears to polar bears (these
two species can produce fertile offspring in captivity). Another possibility 
involves historical lineage sorting. Perhaps polar bears arose within the past 
few tens of thousands of years from coastal populations of brown bears, to 
which their matrilines now appear most closely related. If so, then polar 
bears possess a suite of derived morphological characteristics that may have 
evolved rapidly in response to the special selective conditions of the Arctic, 
a suggestion that has some support from fossil and other evidence (Talbot 
and Shields 1996a). Finally, the direction of evolution might have been exact- 
ly the reverse: ABC Islands brown bears may have arisen recently from a few 
polar bears that moved south. 


RED-WINGED BLACKBIRDS. An absence of dramatic phylogeographic pop- 
ulation structure can be just as interesting and informative as its presence. 
One widely distributed species that failed to display ancient subdivisions in 
matrilineal phylogeny (based on evidence from mtDNA restriction site 
assays) is the red-winged blackbird (Agelaius phoeniceus). A total of 34 dif- 
ferent mtDNA haplotypes were observed among 127 specimens collected 
from across North America, but all of these haplotypes were closely related, 
and they were not obviously partitioned geographically (Ball et al. 1988). 
Indeed, almost all of the haplotypes were related in a “starburst” pattern, 
with the most common haplotype (nearly ubiquitously distributed) at its 
core (Figure 6.10A). These findings indicate that redwing populations 
throughout most of the continent have been in strong and recent genetic 
contact. To a first approximation, the entire species can be considered a sin- 
gle, tight-knit evolutionary unit. 

Furthermore, the data could be grouped into a frequency histogram of 
pairwise mtDNA genetic distances among sampled individuals, and this in 
turn could be converted (assuming a conventional molecular clock) into a 
distribution of estimated times to shared matrilineal ancestry (Figure 6.10B). 
Such histograms, termed “mismatch distributions,” bear somewhat pre- 
dictable relationships under coalescent theory (see Box 6.5) to historical 
population demography and evolutionary effective population size (Fu 
1994a,b), in this case of females (N_F(e)). For red-winged blackbirds, a reason-
ably good fit of the data to coalescent expectations was obtained by assum-
ing N_F(e) = 40,000 individuals. Furthermore, mild departures of this mis-
match distribution from the theoretical expectation for a single population 
of this size were in a direction suggestive of a recent population expansion 
(Rogers and Harpending 1992). These findings make considerable biologi- 
cal sense because A. phoeniceus must have expanded its range across much 
of the continent within the last 18,000 years, following the retreat of the most 
recent Pleistocene glaciers. 
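
The construction of a mismatch distribution can be sketched in a few lines. The example below tallies the number of differences between every pair of sampled haplotypes and converts the tallies to frequencies. The haplotypes are invented presence/absence strings for six restriction sites, arranged as a small "starburst" around one common type; they are not the blackbird data of Ball et al. (1988), and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(haplotypes):
    """Frequency of pairwise comparisons showing 0, 1, 2, ... differences."""
    counts = Counter(
        sum(x != y for x, y in zip(a, b))
        for a, b in combinations(haplotypes, 2)
    )
    total = sum(counts.values())
    return {d: counts[d] / total for d in sorted(counts)}


# Hypothetical restriction-site profiles (1 = site present, 0 = absent):
# two copies of a common haplotype plus three one-step derivatives.
sample = ["110010", "110010", "110011", "100010", "110110"]
dist = mismatch_distribution(sample)
print(dist)
```

Converting such a histogram into estimated times to shared ancestry additionally requires a molecular clock calibration, which is not assumed here.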


Figure 6.10 Mitochondrial DNA patterns in a continent-wide restriction site
survey of red-winged blackbirds. (A) Starburst phylogeny, with the most common
and widespread haplotypes shown in black, hypothetical haplotypes (not observed)
indicated by gray shading, and other haplotypes (all rare) indicated by open circles.
(B) Mismatch frequency distribution, showing inferred times to shared maternal
ancestry and estimated evolutionary effective population size for this species.


GLACIAL REFUGIA FOR HIGH-LATITUDE FISHES. Phylogeographic appraisals
have been conducted on several species of freshwater fishes inhabiting high
latitudes of North America and Eurasia. In many cases, the genetic footprints
of Quaternary refugia have been evident in the contemporary spatial
distributions of mtDNA phylogroups or clades (see review in Bernatchez
and Wilson 1998). Some of these surveys have been Holarctic in scale (e.g.,
Brunner et al. 2001). Typically, differentiated matrilineal clades in these fishes
appear to be regionally organized in ways that reflect postglacial dispersal
and sometimes secondary overlaps of distinctive phylogroups that had
accumulated genetic differences in allopatry. When secondary sympatry has
been achieved, molecular markers also permit examination of reproductive
compatibility between the divergent forms.

Figure 6.11 Population frequencies of the two major mtDNA clades across the
range of the rainbow smelt (Osmerus mordax). (After Bernatchez 1997.)

One such example involves the rainbow smelt (Osmerus mordax) of 
northeastern North America. Bernatchez (1997) mtDNA-genotyped a total of 
1,290 smelt from 49 populations across the species' native range, and uncov- 
ered two highly divergent clades whose spatial distributions are summa- 
rized in Figure 6.11. Eastern populations were largely dominated by one
mtDNA clade and western populations by the other. Furthermore, this 
genealogical dichotomy proved to be largely independent of life history 
forms, which include lake-dwelling and anadromous fish (see also Taylor 
and Bentzen 1993). Most likely, the so-called Atlantic and Acadian races had 
survived in glacial refugia along the Atlantic coastal plain and in the Grand 
Banks area, respectively. Based on paleogeographic as well as this genetic 
evidence, Bernatchez (1997) further postulated that the Atlantic race then col- 
onized northern regions about 5,000 years prior to the Acadian race, with 
both clades eventually coming into secondary contact in the St. Lawrence 
River estuary, where a suspected evolution of reproductive isolating mecha- 
nisms between the two races then ensued. All of these interpretations depart 
dramatically from the conventional (pre-molecular) biogeographic wisdom 
that all rainbow smelt populations originated from one coastal refugium. 

In Europe, an emerging view is that various fishes (as well as other biotas;
Bilton et al. 1998; Stewart and Lister 2001) survived the last glaciation
not only in isolated refugia of the Mediterranean region (Hewitt 2000; 
Taberlet and Cheddadi 2002; Taberlet et al. 1998), but also farther north 
(Bernatchez 2001; Hanfling et al. 2002; Kotlik and Berrebi 2001). An interest- 
ing example involves Barbus freshwater fishes, especially in the Black Sea 
area. Today, the Black Sea is an inland body of salt water, fed by numerous 
large rivers draining most of Europe and connected to the Atlantic Ocean 
via the Mediterranean. Toward the end of the Pleistocene, however, it had 
become a giant freshwater lake as inflow of Mediterranean seawater was 
interrupted during the last glaciation. Then, about 7,500 years ago, marine 
conditions were reestablished in the Black Sea basin when catastrophic 
flooding by Mediterranean waters occurred. Kotlik et al. (2004) used 
mtDNA sequences to test whether Barbus in rivers surrounding the Black 
Sea might all trace to a recent common ancestor that could have inhabited 
the Black Sea basin during its freshwater phase. Results showed instead that 
highly divergent lineages now occupy different river drainages entering the 
Black Sea, indicating that multiple refugial populations probably survived 
throughout the late Pleistocene in the vicinity of this ancient lake. 


LACERTID LIZARDS ON ISLANDS. Molecular phylogeographic patterns can 
also serve as genealogical backdrops for interpreting evolutionary histories 
of organismal phenotypes. This exercise can be thought of as a microevolu- 
tionary analogue of “phylogenetic character mapping” (PCM) at intermedi- 
ate and higher taxonomic levels (discussed in Chapter 8). 

An illustration of PCM in the context of intraspecific phylogeography 
involves the Canary Island lizard (Gallotia galloti). A molecular genealogy
for this species, based on mtDNA and other genetic data (Thorpe et al. 1993, 
1994), indicated the presence of two distinct lineages whose colonization 
histories could be rigorously hypothesized as a colonization of La Palma 
Island from North Tenerife and separate sequential colonizations of Gomera 
and Hierro islands from South Tenerife (Figure 6.12). Thorpe (1996) then 
used these historical inferences to interpret the distributions of 30 variable 
morphological features. Nine of these 30 characteristics (30%) were signifi- 
cantly associated with the molecular phylogeny. For example, blue spots on 
the foreleg and hindleg are present in the southern lineage, but absent in the 
northern lineage and in congeneric outgroup species, indicating that these 
features (thought to be employed in sexual communication) are synapo- 
morphies that probably arose once in a common ancestor of the lizards on 
Gomera and Hierro islands. Examples of phenotypic characters not clearly 
associated with phylogeny were dorsal yellow bars and gracile heads, both 
of which tend to be present in animals inhabiting wet, lush areas irrespec- 
tive of the molecular lineage to which they belong. Perhaps these characters 
are strongly selected for under these ecological circumstances, or perhaps 
there are direct dietary effects on phenotype (especially with regard to head 
and jaw size). 

Figure 6.12 Evolutionary history of the Canary Island lizard (Gallotia galloti). 
Arrows indicate the colonization sequence of two major evolutionary lineages as 
deduced from molecular genetic evidence. Note the pronounced spots on the fore- 
leg and hindleg, which appear to be synapomorphies unique to the southern line- 
age. (After Thorpe 1996.) 


As in any such PCM exercise, this analysis of island lizards assumes 
that the molecular markers employed are reliable indicators of genomic 
history. To the extent that this might not be entirely true, interpretations of 
the histories and causal factors impinging on phenotypes could, of course, 
be compromised. 


MÜLLERIAN MIMICRY BUTTERFLIES. Many invertebrates show mtDNA 
phylogeographic patterns quite like those of vertebrates: that is, genealog- 
ical separations at varying evolutionary depths, and often major genetic 
breaks between regional population arrays or phylogroups. A case in point 
is the tropical butterfly species Heliconius erato, traditionally described as 
being composed of more than a dozen allopatric races, each displaying a 
unique wing coloration pattern. These wing patterns not only vary geo- 
graphically across northern South America, but they do so in parallel with 
wing-color races of a related species (H. melpomene). Both species are 
unpalatable to predators, so these butterflies collectively provide a classic 
example of Müllerian mimicry. It has long been of interest to calibrate the 
evolutionary rates and processes by which the different wing-color forms 
have arisen. 

Toward that end, Brower (1994) assayed mtDNA sequences in H. erato 
across much of its range. More than a dozen different haplotypes were detect- 
ed, but these haplotypes grouped into two highly distinct clades that appear 
to be confined to opposite sides of the Andes Mountains, and which by molec- 
ular clock considerations separated about 1.5 to 2.0 million years ago. Within 
each phylogroup, by contrast, mtDNA sequence differences were small to 
negligible. Yet nearly identical wing-color patterns were observed within and 
between the two mtDNA phylogroups. Overall, the phylogeographic back- 
drop provided by mtDNA suggests that H. erato experienced a rather ancient 
population sundering event related to the gene flow barrier that the Andes 
provide, and that since that time there have been multiple instances of rapid 
and often convergent evolution in wing coloration patterns (Brower 1994). 


EUROPEAN TREES. Until fairly recently, phylogeographic studies in plants 
lagged far behind those in animals, due mostly to the perceived poor suit- 
ability of plant cytoplasmic genomes for such tasks. However, with the 
advent of better laboratory methods and the larger data sets to which they 
permit access, chloroplast (cp) DNA has become a powerful workhorse for 
botanical phylogeographic analyses at the intraspecific level (Petit et al. 
2001, 2003a,b; Schaal et al. 1998, 2003; D. E. Soltis et al. 1992). 

Some of the earliest and most extensive work involved detailed analy- 
ses of cpDNA variation in eight species of European white oaks (Quercus) 
sampled from more than 2,600 sites in 37 European countries (see reviews in 
Petit and Vendramin 2004; Petit et al. 2003b). Genetic footprints from the 
chloroplast genome revealed several primary and secondary Pleistocene 
refugia where genetic differentiation must have been initiated, as well as 
specific postglacial colonization routes from those isolated southern pock- 
ets. Polymorphic genetic markers from cpDNA have also helped to reveal 
hybridization patterns between various European oak species (Bacilieri et 
al. 1996; Belahbib et al. 2001; Petit et al. 2002). 

Such analyses based on cpDNA sequences were then extended to 22 
widespread species of trees and shrubs (Petit et al. 2003b). This massive 
phylogeographic survey not only helped to identify primary glacial refugia 
in Europe, but also demonstrated that the most genetically diverse popula- 
tions now occur at intermediate (rather than southern) latitudes, probably 
as a consequence of genetic admixture of divergent lineages that had 
expanded outward from their ancestral homes. Thus, Pleistocene refugia in 
Europe were historical wellsprings of genetic diversity in plants, but mod- 
ern admixture zones are the current melting pots. 


FREE-LIVING MICROBES. Even some of the world's smallest creatures have 
attracted phylogeographic attention. Some of this work has been motivated 
by the "ubiquitous dispersal" hypothesis (see Finlay 2002), which posits that 
by virtue of their numerical abundance and ease of dispersal, most free-living 
species of common microbial eukaryotes with body sizes less than about 
2 mm, as well as superabundant free-living prokaryotes that are much 
smaller, probably lack appreciable population structure across huge (even 
global) geographic scales. This suggestion, based mostly on theoretical con- 
siderations and indirect evidence such as morphology-based taxonomy and 
microbial community structure, has been controversial (Coleman 2002). 

In recent years, the ubiquitous dispersal hypothesis has been put to pre- 
liminary empirical genetic tests. In partial support of this notion is a finding 
by Darling et al. (2000) that Arctic and Antarctic subpolar populations with- 
in each of two planktonic foraminiferan species (Globigerina bulloides and 
Turborotalita quinqueloba) share at least one identical genotype at an other- 
wise variable rRNA gene, suggesting that trans-tropical gene flow has 
occurred recently. Likewise, an identical rDNA genotype has been reported 
in some flagellated protists worldwide, including at such disparate sites as 
a shallow fjord in Denmark and hydrothermal deep-sea vents in the eastern 
Pacific (Atkins et al. 2000). On the other hand, various green algal protists in 
several recognized genera, such as Pandorina and Volvulina, have proved 
upon genetic examination to consist of numerous sexually isolated groups 
(syngens) that are otherwise morphologically nearly identical (Coleman 
2000). Furthermore, within some of these surveyed syngens, genetic dis- 
tances among rDNA sequences appear to increase with geographic distance 
between collecting sites, suggesting considerable biogeographic population 
structure (Coleman 2001). Likewise, in rDNA surveys of Sulfolobus microbes 
(Archaea) that live in isolated geothermal environments, significant genetic 
differentiation has been documented among various populations scattered 
around the world, thus contradicting predictions of the unrestricted disper- 
sal hypothesis (Whitaker et al. 2003). 

One phenomenon that has complicated biogeographic reconstructions 
in some microbes, notably bacteria (Parker and Spoerke 1998; Qian et al. 
2003; Spratt and Maiden 1999), is horizontal gene transfer (see Chapter 8), 
which can create genomes with mosaic evolutionary histories and conflict- 
ing phylogeographic patterns across loci (Parker 2002). Other challenging 
difficulties include identifying particular taxa or clades to begin with 
(because the morphological evidence is often inadequate) and sampling 
them extensively enough from across vast regions of Earth to critically test 
the ubiquitous dispersal hypothesis using molecular markers (see John et al. 
2003 for an example). 


HUMAN POPULATIONS. Not surprisingly, more attention has been paid to 
phylogeography in Homo sapiens than in any other species. Among many 
early studies (e.g., Ballinger et al. 1992; Cann et al. 1984; Denaro et al. 1981; 
DiRienzo and Wilson 1991; Excoffier 1990; Hasegawa and Horai 1991; 
Johnson et al. 1983; Merriwether et al. 1991; Stoneking et al. 1986; Torroni et 
al. 1992; Vigilant et al. 1991; Ward et al. 1991; Whittam et al. 1986), two cap- 
tured the essence of the situation and stand out as having had major histor- 
ical and conceptual impacts. 

First, an early glimpse of global mtDNA diversity came from RFLP 
analyses of 21 people of diverse racial and geographic origins (Brown 1980). 
Genetic differentiation proved to be quite limited (mean sequence diver- 
gence p = 0.004). Vastly more mitochondrial data have accumulated since 
Brown’s original study, but current estimates of mtDNA sequence diversity 
remain nearly identical to that preliminary appraisal. Thus, the overall pic- 
ture for human matrilines remains one of fairly shallow evolutionary sepa- 
rations relative to those reported among conspecific populations in most 
other species. In this regard, the mtDNA results also parallel long-standing 
findings from the nuclear genome that human populations and races are 
remarkably similar in molecular makeup, notwithstanding obvious pheno- 
typic differences in traits such as hair texture and skin color (Boyce and 
Mascie-Taylor 1996; Nei and Livshits 1990; Nei and Roychoudhury 1982). For 
example, from an early summary of the protein electrophoretic literature, Nei 
(1985) concluded that "net gene differences between the three races of man, 
Caucasoid, Negroid, and Mongoloid, are much smaller than the differences 
between individuals of the same races, but this small amount of gene differ- 
ences corresponds to a divergence time of 50,000 to 100,000 years." 

Brown (1980) also included a provocative statement in his original 
study: that the observed magnitude of mtDNA diversity "could have been 
generated from a single mating pair that existed 180-360 × 10³ years ago, 
suggesting the possibility that present-day humans evolved from a small 
mitochondrially monomorphic population that existed at that time." This 
statement implied that the coalescence of extant human matrilines might 
trace to a single female (dubbed "Eve" by the popular press) within the last 
few hundred thousand years, and furthermore, that the data indicated a 
severe bottleneck in absolute human numbers (the "Garden of Eden" sce- 
nario). The latter conclusion was soon challenged with results of models 
and computer simulations of population lineage sorting as a function of his- 
torical population demography. From such gene tree theory, Avise et al. 
(1984a) concluded that "Eve could have belonged to a population of many 
thousands or tens of thousands of females, the remainder of whom left no 
descendants to the present day, due simply to the stochastic lineage extinc- 
tion associated with reproduction." Several other authors likewise pointed 
out that simply because the genealogy of mtDNA (or any other locus) is 
observed to coalesce does not necessarily imply an extreme bottleneck in 
absolute population numbers at the coalescent point (Ayala 1995; Hartl and 
Clark 1989; Latorre et al. 1986; Wilson et al. 1985). Later analyses of various 
nuclear genes in humans bolstered the notion that Homo sapiens may never 
have experienced an acute bottleneck: "Genetic variation at most loci exam- 
ined in human populations indicates that the [effective] population size has 
been ≈ 10⁴ for the past one Myr ... [and] population size has never dropped 
to a few individuals, even in a single generation" (Takahata 1993). 
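
The lineage-sorting argument can be illustrated with a small simulation. The sketch below is a minimal, idealized model and is not the procedure used by Avise et al. (1984a): it assumes a constant number of females, non-overlapping generations, and daughters that draw their mothers at random. The function name, the population size of 100, and the number of replicates are hypothetical choices made only for illustration.

```python
import random


def generations_to_matrilineal_coalescence(n_females, rng):
    """Count generations until all females descend from one founding matriline."""
    # Give each founding female's matriline a distinct integer label.
    lineages = list(range(n_females))
    generations = 0
    while len(set(lineages)) > 1:
        # Each daughter in the next generation draws her mother at random,
        # so the number of females stays constant through time.
        lineages = [rng.choice(lineages) for _ in range(n_females)]
        generations += 1
    return generations


rng = random.Random(1)  # fixed seed so that repeated runs give the same result
times = [generations_to_matrilineal_coalescence(100, rng) for _ in range(20)]
mean_time = sum(times) / len(times)
print(f"mean generations until one matriline remains: {mean_time:.0f}")
```

In this idealized model a single founding matriline eventually replaces all others even though the number of females never changes, which is the point made in the text: coalescence to one ancestor does not by itself imply a bottleneck in absolute population size.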

The second of the hallmark phylogeographic studies on humans 
extended the mtDNA survey to 147 people from around the world and pro- 
duced a parsimony tree whose root traced to the African continent (Cann et 
al. 1987). These findings led to the "out of Africa" hypothesis, stating that 
maternal lineages ancestral to modern humans originated in Africa and 
spread within the last few hundred thousand years to the rest of the world, 
replacing those of other archaic populations. This conclusion also provoked 
initial controversy. One criticism came from some paleontologists who, on 
the basis of fossil or other evidence, favored a multi-regional origin for 
humans far preceding the apparent evolutionary date of the mtDNA spread 
(e.g., Wolpoff 1989; Wolpoff et al. 1984; but see also Stringer and Andrews 
1988; Wilson et al. 1991). Another criticism came from some geneticists who 
found that the postulated African root of the molecular phylogeny was not 
strongly supported (nor refuted) when additional tree-building analyses 
were applied to the mtDNA data (Hedges et al. 1992c; Maddison 1991; 
Templeton 1992). What remained unchallenged, however, was another 
argument for an ancestral African homeland for mtDNA: that extant African 
populations house by far the highest level of mtDNA polymorphism and, 
indeed, are paraphyletic with respect to populations on other continents. 

Being based on mtDNA evidence, these early discussions about human 
origins referred to the matrilineal component of our ancestry. To illuminate 
patrilineal ancestry, and perhaps to reveal "Adam" (the "father of us all"; 
Gibbons 1991), analogous molecular studies were then conducted on the 
human Y chromosome. Genealogical analyses of several such sequences 
uncovered modest genetic variation in our species (Dorit et al. 1995; Whitfield 
et al. 1995), with results interpreted to indicate a relatively recent coalescent 
event for human patrilines in Africa (Hammer 1995; Ke et al. 2001). 

At first thought, it might be supposed that knowledge of the matrilineal 
and patrilineal components of human ancestry would complete the story, but 
this is far from true. The vast majority of any sexual species' genetic heritage 
involves nuclear loci whose alleles have been transmitted via both genders 
across the generations. Due to the vagaries of Mendelian inheritance (random 
segregation and independent assortment), nuclear genealogies inevitably dif- 
fer somewhat from gene to unlinked gene, as well as from the uniparental 
transmission pathways traversed by mtDNA and the Y chromosome (Avise 
and Wollenberg 1997). So attempts have been made to add nuclear gene 
genealogies (other than those on the Y) to analyses of human origins. Among 
published examples are studies of the X-linked ZFX gene (Huang et al. 1998) 
and of autosomal genes encoding apolipoproteins (Rapacz et al. 1991; Xiong 
et al. 1991), β-globin (Fullerton et al. 1994; Harding et al. 1997), and others 
(Ayala 1996; Tishkoff et al. 1996; Wainscoat et al. 1986). Takahata et al. (2001) 
used DNA sequence data from ten X-chromosomal regions and five autoso- 
mal regions to deduce ancestral haplotypes at each locus, and the analyses 
offered substantial support for African (rather than Asian) human genetic ori- 
gins during the Pleistocene. Findings from most loci surveyed to date are also 
generally consistent with the presence of tight genealogical connections 
among human populations worldwide (Takahata 1995). 

Thus, most available nuclear data also tend to support a relatively 
recent out-of-Africa expansion scenario for our species (e.g., Armour et al. 
1996; Goldstein et al. 1995b; Nei and Takezaki 1996; Reich and Goldstein 
1998), while not necessarily eliminating the possibility that some relatively 
small fraction of genes may also have had early diversification centers else- 
where, such as in Asia (Giles and Ambrose 1986; Takahata et al. 2001; Xiong 
et al. 1991). One plausible scenario is that humans with modern anatomical 
features appeared first in Africa and then spread throughout the world, not 
completely replacing archaic populations, but rather interbreeding with 
them to some extent (Li and Sadler 1992). To address such issues compre- 
hensively will require secure knowledge of many more nuclear gene 
genealogies in peoples from around the world. 


Genealogical concordance 


The literature on mtDNA variation in animals (and cpDNA variation in 
plants) has demonstrated that nearly all species are likely to be genetically 
structured across geography, at least to some degree. Given that geographic 
variation can vary tremendously in magnitude and will have been affected 
by forces that operated at a wide range of ecological and evolutionary time 
frames, it is important to recognize not only the proper spatial scale, but also 
the proper temporal scale in each case. 

One empirical way to do so (and perhaps the only way from molecular 
genetic evidence) is by appealing to "genealogical concordance" principles 
(Avise and Ball 1990), which in general provide a conceptual framework for 
empirically distinguishing historically deep (ancient) from shallow (recent) 
population structures, based on levels of agreement between independent 
genetic characters or data sets. For heuristic purposes, genealogical concor- 
dance has four distinct aspects. These aspects are diagrammed in Figure 6.13 
and will be described in turn with a few examples provided of each. 


ACROSS MULTIPLE SEQUENCE CHARACTERS WITHIN A GENE. Almost by def- 
inition, any deep phylogenetic split deduced in a gene genealogy will have 
been registered concordantly by multiple independent sequence changes 
within the molecule. For example, the evolutionary separation between 
eastern and western matrilines in southeastern pocket gophers (see Figure 
6.8) was deemed to be relatively ancient precisely because at least nine inde- 
pendent restriction profiles agreed perfectly in delineating these matrilineal 
clades, whereas in these same molecular assays, haplotypes within either 
the eastern or the western clade usually differed by no more than two such 
changes. Numerous species have likewise proved to consist of geographic 
sets of populations that differ from other such groups by many more muta- 
tional steps (i.e., display higher sequence divergence) than typically occur 
within a geographic region. 

Figure 6.13 Schematic description of four distinct aspects of genealogical 
concordance. A and B are distinctive phylogroups in a gene tree. (After Avise 2000a.) 

The quantitative significance of genealogical concordance aspect I is 
that bootstrapping or related statistical criteria permit the recognition of 
putative clades in a gene tree only when multiple characters consistently 
distinguish what therefore become well-supported clades. In theory, at 
least three or four diagnostic genetic characters, uncompromised by homo- 
plasy, are required for robust statistical recognition of a putative gene tree 
clade. Empirically, many more nucleotide substitutions than that often 
cleanly distinguish regional sets of populations in phylogeographic sur- 
veys of mtDNA. The biological significance of aspect I is that appreciable 
evolutionary time must normally have elapsed for multiple independent 
mutations to accumulate between lineages within a non-recombining gene 
tree. Furthermore, under neutrality theory and molecular clock concepts, 
the greater the magnitude of sequence divergence, the greater the time 
elapsed (all else being equal). 
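
The clock relationship just described can be written as a one-line calculation. The sketch below assumes a strict molecular clock, and its default rate of 0.02 (2% pairwise sequence divergence per million years) is only an illustrative calibration; an appropriate rate must be justified for each gene and taxon. The function name and the example divergence values are hypothetical.

```python
def divergence_time_myr(sequence_divergence, pairwise_rate_per_myr=0.02):
    """Estimate time since separation (in millions of years) under a strict clock.

    sequence_divergence: proportion of sites differing between two lineages.
    pairwise_rate_per_myr: assumed divergence accumulated between a pair of
        lineages per million years (illustrative default, not a universal rate).
    """
    if sequence_divergence < 0 or pairwise_rate_per_myr <= 0:
        raise ValueError("divergence must be >= 0 and rate must be > 0")
    return sequence_divergence / pairwise_rate_per_myr


# Larger divergence implies more elapsed time, all else being equal.
for p in (0.004, 0.02, 0.06):
    print(f"p = {p:.3f} -> about {divergence_time_myr(p):.2f} Myr")
```

Because the estimate scales inversely with the assumed rate, any uncertainty in the calibration carries over directly to the inferred time.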


ACROSS GENE GENEALOGIES WITHIN A SPECIES. In principle, phylogeo- 
graphic breaks in a gene tree can arise not only from long-term vicariant sep- 
arations, but also from isolation by distance in continuously distributed 
species (Irwin 2002; Neigel and Avise 1993). To distinguish gene-idiosyncratic 
or spatially haphazard genealogical breaks (due to isolation by distance, or 
perhaps to gene-specific selection) from ancient vicariance-induced 
genealogical breaks (whose effects are more likely to be genomically exten- 
sive), aspect II of genealogical concordance should be addressed. 

Suppose that within a given species, gene genealogies have been esti- 
mated not only for mtDNA or cpDNA, but also for each of multiple 
unlinked nuclear loci within each of which inter-allelic recombination had 
been rare or absent over the time frame under scrutiny. Suppose further 
that each of those gene trees displayed a deep genealogical split (as defined 
by aspect I of genealogical concordance), and that those splits agreed well 
or perfectly with respect to the particular populations distinguished. This 
is what is meant by aspect II of genealogical concordance. Its biological sig- 
nificance is that these concordant partitions across independent gene trees 
within an organismal pedigree almost certainly register a fundamental (i.e., 
genomically pervasive) phylogenetic split at the population level. In other 
words, extant populations that concordantly occupy different major 
branches in multiple gene trees probably separated from one another long 
ago. 
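
One simple way to make aspect II operational is to record, for each locus, how the deepest split in its gene tree partitions the sampled populations, and then ask how many loci agree. The sketch below is only an illustration under that assumption; the locus names, population labels, and function names are hypothetical, and real analyses would also need to weigh the statistical support for each split.

```python
from collections import Counter


def normalize_split(group_a, group_b):
    """Represent a two-way split of populations regardless of group order."""
    return frozenset((frozenset(group_a), frozenset(group_b)))


def concordance_fraction(splits_by_locus):
    """Fraction of loci whose deepest split matches the most common partition."""
    counts = Counter(normalize_split(a, b) for a, b in splits_by_locus.values())
    _, top_count = counts.most_common(1)[0]
    return top_count / len(splits_by_locus)


# Hypothetical deepest splits inferred from four independent gene trees.
deepest_splits = {
    "mtDNA": ({"pop1", "pop2"}, {"pop3", "pop4"}),
    "nuclear_locus_1": ({"pop3", "pop4"}, {"pop1", "pop2"}),
    "nuclear_locus_2": ({"pop1", "pop2"}, {"pop3", "pop4"}),
    "nuclear_locus_3": ({"pop1", "pop3"}, {"pop2", "pop4"}),
}
fraction = concordance_fraction(deepest_splits)
print(f"loci sharing the most common partition: {fraction:.2f}")
```

In this made-up example three of the four loci separate the same two sets of populations, the kind of agreement across unlinked genes that would suggest a genomically pervasive split.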

As discussed in earlier chapters, several technical as well as biological 
complications have conspired to inhibit molecular appraisals of nuclear 
gene trees at intraspecific levels, but a few informative cases do exist (Hare 
2001). One of the earliest and most interesting involved the killifish 
Fundulus heteroclitus, an inhabitant of salt marshes along the eastern 
seaboard of the United States. Near the midpoint of this coastline, two common 
alleles at a lactate dehydrogenase (LDH) nuclear gene proved to exhibit a 
pronounced clinal shift in frequency (Figure 6.14A). Detailed laboratory 
studies revealed kinetic and biochemical differences between these 
LDH alleles that predicted significant differences among individuals in 
metabolism, oxygen transport, swimming performance, developmental 
rate, and relative fitness (Powers et al. 1991a; Schulte et al. 1997). The nature 
of these differences was such that latitudinal shifts in environmental 
temperature were posited as directly responsible for the clinal allelic structure 
(Mitton and Koehn 1975; Powers et al. 1986). Does contemporary adaptation 
to local ecological conditions provide the entire story for the genetic 
architecture of these killifish populations? 

Figure 6.14 Molecular geographic patterns in the killifish Fundulus heteroclitus. 
(A) Latitudinal cline in population frequencies of the b allele in an LDH nuclear 
polymorphism. (B) Phenogram of mtDNA haplotypes in the same populations, 
showing a deep phylogenetic distinction between northern and southern areas (a 
similar phylogeographic pattern was observed in a gene tree of LDH haplotypes). 
(After Powers et al. 1991a.) 
[Panel A is plotted against degrees north latitude (44 to 30); panel B is plotted 
against percent sequence divergence in mtDNA (2.0 to 0.0), with northern and 
southern populations labeled.] 

Researchers then generated an mtDNA gene tree (González-Villaseñor 
and Powers 1990) as well as a sequence-based LDH gene tree (Bernardi et al. 
1993) for killifish populations sampled along the same coastal transect. 
These trees demonstrated a pronounced phylogenetic subdivision of 
F. heteroclitus into northern versus southern matrilineal clades (Figure 6.14). Thus, 
northern and southern populations were probably isolated from each other 
during the Pleistocene and now hybridize secondarily along the mid- 
Atlantic coast in such a way as to contribute to the clinal structure observed 
in LDH allele frequencies and in some other nuclear genes. This example 
demonstrates two points: that genealogical concordance across independent 
loci (aspect II) can provide empirical support for historical vicariance at the 
population level, and that phylogenetic and selective mechanisms need not 
be opposing influences and may in some cases act in concert to achieve an 
observed population structure (Powers et al. 1991b). Although the differences 
in LDH allele frequency are mediated to a significant degree by environmental 
selection, the historical context in which this selection had taken 
place added an important dimension to knowledge on contemporary population 
genetic structure in killifish. 

Only a small number of studies to date have explicitly searched for 
intraspecific genealogical concordance across multiple unlinked loci (introns 
and/or exons). In a fungal taxon, Coccidioides immitis, Koufopanou et al. (1997) 
showed strong phylogeographic agreement across five loci that partitioned 
California from non-California populations (which might therefore be cryptic 
species). In the European grasshopper Chorthippus parallelus, concordant 
patterns in allozymes, nuclear DNA sequences, and other characters generally 
distinguish parapatric subspecies (Cooper and Hewitt 1993; Cooper et al. 
1995). In the tide pool copepod Tigriopus californicus, Burton (1998) found 
general agreement between nDNA and mtDNA with regard to a deep 
phylogeographic partition. On the other hand, Palumbi and Baker (1994) found 
sharply contrasting phylogeographic structures for nuclear intron sequences and 
mtDNA in humpback whales (Megaptera novaeangliae). The discrepancy in this 
case probably reflects either differences in genetic drift in mitochondrial versus 
nuclear genes (due to their expected difference in effective population 
size) or asymmetric dispersal by sex, in which males transferred nuclear DNA 
(but not mtDNA) to offspring they sired in foreign matrilineal groups. 

In phylogeographic surveys of the European rabbit (Oryctolagus cuniculus) 
across the Iberian Peninsula, a well-defined genealogical break in mtDNA 
(Branco et al. 2002) is spatially concordant with patterns in nuclear protein 
and immunological polymorphisms, but is not registered by major shifts in 
microsatellite allele frequencies (Queney et al. 2001). The authors concluded 
that microsatellite loci mutate so rapidly as to produce extensive homoplasy, 
which obscured what apparently was a relatively ancient (2-million-year-old) 
population separation. In general, rapidly evolving microsatellite loci may be 
better suited for revealing recent or shallow population structures than for 
confirming deep historical structures that may be registered in other classes of 
molecular markers (Gibbs et al. 2000b; Mank and Avise 2003). 

Because nuclear gene trees are difficult to obtain at the intraspecific level, 
a surrogate approach that falls within a broader conceptual framework of 
concordance aspect II is to examine other kinds of nuclear genetic evidence for 
possible phylogeographic agreement with a cytoplasmic gene tree. For exam- 
ple, in the North American sharp-tailed sparrow (Ammodramus caudacutus), a 
rather deep split in matrilineal genealogy (as registered in mtDNA) distin- 
guished an assemblage of birds currently inhabiting the continental interior as 
well as the northern Atlantic coast from a southern group that occurs along 
the Atlantic coast from southern Maine to Virginia (Rising and Avise 1993). 
Populations belonging to these two mtDNA clades also proved to be concor- 
dantly recognizable in multivariate analyses of morphology, song, and flight 
displays (Greenlaw 1993; Montagna 1942). Such concordance among independent 
lines of (presumably genetic) evidence indicates that the split in the 
mtDNA gene tree also reflects a rather deep phylogenetic distinction in organ- 
ismal phylogeny. Indeed, these two sets of populations were later elevated to 
the status of separate taxonomic species. 

In other empirical cases, phylogeographic breaks in an mtDNA gene 
tree do not seem to coincide with sudden changes in other organismal traits 
(e.g., Bond et al. 2001; Irwin et al. 2001a,b; Puorto et al. 2001). As discussed 
earlier in this chapter, possible reasons for such discrepancies are many and 
must be scrutinized on a case-by-case basis. 


ACROSS CO-DISTRIBUTED SPECIES. Suppose now that several species with 
similar geographic ranges have been genetically surveyed, that several or all 
of them display deep phylogeographic structure (as evidenced by aspects I 
or II of genealogical concordance, as described above), and that the phylo- 
geographic partitions are at least roughly similar in spatial placement and 
perhaps temporal depth. Aspect III of genealogical concordance would then 
have been documented. The biological significance of aspect III is that it 
strongly implicates historical biogeographic factors as having shaped the 
genetic architectures of multiple species in similar fashion. Studies conduct- 
ed under this multi-species orientation exemplify what has been termed the 
"regional" (Avise 1996), "landscape" (Templeton and Georgiadis 1996), or 
"comparative" (Bermingham and Moritz 1998) approach to phylogeography. 

The first extensive phylogeographic appraisals of a regional biota 
involved numerous freshwater and maritime species in the southeastern 
United States (see reviews in Avise 1992, 1996, 2000a). In both of these envi- 
ronmental realms, a remarkable degree of aspect III phylogeographic con- 
cordance was evidenced in molecular genetic surveys. For example, within 
each of several freshwater fishes, including Micropterus salmoides bass, Amia 
calva bowfins, Gambusia mosquitofish, and each of four species of Lepomis 
sunfish, deep phylogeographic partitions in the mtDNA molecule typically 
distinguished populations from most river drainages entering the Gulf of 
Mexico from those inhabiting most watersheds of the Atlantic Ocean 
(Figures 6.15 and 6.16). Some of these species have also been surveyed for 
nuclear allozyme markers, for which comparable genetic breaks between 
these “eastern” and “western” forms (as well as hybridization between 
them in secondary contact zones) have been documented (Avise and Smith 
1974; Philipp et al. 1983b; Scribner and Avise 1993a; Wooten and Lydeard 
1990). Similar (albeit more complicated) phylogeographic patterns in 
mtDNA also characterize several freshwater turtle species surveyed 
throughout this same geographic area (Roman et al. 1999; Walker et al. 1995, 
1997, 1998a,b; see review in Walker and Avise 1998). 

Figure 6.15 Relationships among mtDNA haplotypes in seven species of 
freshwater fish surveyed across the southeastern United States. Data are all 
plotted on the same scale of estimated sequence divergence. Data for Lepomis 
and Amia are from Bermingham and Avise (1986) and Avise et al. (1984b), those 
for Gambusia are from Scribner and Avise (1993), and those for Micropterus are 
from Nedbal and Philipp (1994). (For some of these taxa, populations were 
considered conspecific at the time of the original assays but have since been 
taxonomically subdivided into eastern and western sister species. For example, 
the largemouth bass was split into M. salmoides and M. floridanus; Kassler et 
al. 2002.) (After Avise 2000a.) 

Figure 6.16 Geographic distributions of major mtDNA clades within each of the 
seven freshwater fish species described in Figure 6.15. For each species, pie 
diagrams summarize observed frequencies of the two fundamental clades at various 
geographic sites. (After Avise 2000a.) 

Likewise, in the maritime realm of the southeastern United States, 
major and concordant genetic subdivisions have been reported in a wide 
variety of vertebrate and invertebrate species (Figure 6.17). Thus, apparently 
deep historical partitions, as registered in mtDNA or various nuclear 
assays or both, tend to characterize most surveyed Atlantic versus most 
Gulf populations of Limulus horseshoe crabs (Saunders et al. 1986), 
Crassostrea oysters (Hare and Avise 1996, 1998; Reeb and Avise 1990), 
Cicindela beetles (Vogler and DeSalle 1993, 1994), Ammodramus sparrows 
(Avise and Nelson 1989), Malaclemys terrapins (Lamb and Avise 1992), 
Geukensia mussels (Sarver et al. 1992), Opsanus toadfish (Avise et al. 1987b), 
Centropristis sea bass (Bowen and Avise 1990), and Fundulus killifish 
(Duggins et al. 1995), among others. Although this phylogeographic pattern 
is far from universal among maritime species that have been genetically 
analyzed from this region (Avise 2000a; Gold and Richardson 1998), it 
nonetheless is prevalent enough to suggest that historical biogeographic 
factors (see below) have affected the genetic architecture of a significant 
fraction of this regional biota. 

Figure 6.17 Geographic distributions of primary genetic subdivisions observed 
within each of nine maritime taxa of the southeastern United States. Pie diagrams 
follow the format described in Figure 6.16. (After Avise 2000a.) 

This general kind of multi-species concordance in phylogeographic pat- 
tern is not unique to faunas in the southeastern United States. Several com- 
parative molecular surveys are now available for other regional biotas 
around the world. These surveys sometimes have (and sometimes have not) 
documented aspect III concordance to varying degrees. For example, con- 
cordant intraspecific phylogeographic patterns across multiple species have 


TABLE 6.5 Examples and outcomes of comparative phylogeography studies

Organisms | Outcome and region

Marine vertebrates and invertebrates | Concordant partitions but apparently varying temporal depths between geminate taxa across the Isthmus of Panama
Marine invertebrates, mostly | Four distinct phylogeographic patterns, each observed in several species, documenting colonization histories of trans-Arctic taxa
Amphipods and other crustaceans | Concordant subdivision of each of six species into two units, one inhabiting the Black Sea and the other the Caspian Sea region
Butterflies and vertebrates | Considerable congruence of phylogeographic patterns in diverse invertebrate and vertebrate taxa of Amazonia
Butterfly fishes (Chaetodon) | Striking phylogeographic concordance between two species groups in the Indo-West Pacific marine realm
Darter fishes | No appreciable phylogeographic concordance among five species in highlands of the south-central United States
Freshwater fishes | Species-idiosyncratic patterns suggesting perhaps three distinct waves of invasion into Central America from South America
Freshwater and anadromous fishes | Consistently deep phylogeographic partitions in species from non-glaciated regions compared with more northern taxa
Herpetofauna | No appreciable concordance in phylogeographic patterns across species in the North American desert southwest
Lizards | Ancient and concordant fragmentation of clades on either side of Wallace’s Biogeographic Line in southeastern Asia
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been identified for elements of both the herpetofauna and the avifauna in 
rainforest remnants of northeastern Australia (Joseph and Moritz 1994; 
Joseph et al. 1995; Moritz and Faith 1998; Schneider et al. 1998) and for dis- 
junct conspecific populations of eight marine fish species in the northern 
Gulf of California versus the outer Pacific coast (Bernardi et al. 2003). Table 
6.5 summarizes several additional examples of comparative molecular phy- 
logeography as applied to regional faunas and floras. 


BETWEEN GENEALOGICAL AND OTHER BIOGEOGRAPHIC INFORMATION. A 
final aspect of genealogical concordance is between molecular genetic data 
and traditional biogeographic evidence based on nonmolecular data. Such 
concordance may apply to individual species if, for example, particular pop- 
ulations that are cleanly demarcated in a molecular genealogy correspond to 
those also recognized (perhaps in taxonomic summaries) from morphologi- 
cal, behavioral, geological, or other more traditional lines of evidence. Many 
such examples are reviewed by Avise (2000a). Or, aspect IV concordance can 
broadly refer to agreement between molecular phylogeographic patterns for 
a regional biota and comparable patterns registered in more conventional 
biogeographic appraisals. 


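Primary reference or review (Table 6.5, listed by organisms row)

Marine vertebrates and invertebrates: Bermingham et al. 1997; Bermingham and Lessios 1993; Knowlton et al. 1993
Marine invertebrates, mostly: Cunningham and Collins 1998
Amphipods and other crustaceans: Cristescu et al. 2003
Butterflies and vertebrates: Hall and Harvey 2002
Butterfly fishes (Chaetodon): McMillan and Palumbi 1995
Darter fishes: Turner et al. 1996
Freshwater fishes: Bermingham and Martin 1998
Freshwater and anadromous fishes: Bernatchez and Wilson 1998
Herpetofauna: Lamb et al. 1989, 1992
Lizards: Schulte et al. 2003
(continued)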
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TABLE 6.5 (continued) Examples and outcomes of comparative phylogeography studies 


Organisms | Outcome and region

Pelagic seabirds and marine turtles | Similar phylogeographic partitions of rookeries on a global scale, but contrasting patterns within ocean basins
Birds | In 7 of 13 species, rather consistent distinctions between populations on alternate sides of Beringia
Birds | No appreciable concordance in phylogeographic patterns in species with transcontinental ranges
Birds | Species-idiosyncratic phylogeographic patterns in species of the Caribbean islands
Birds | Concordant genetic evidence from two African species for three biogeographic regions in area of Cameroon, western Africa
Mammals, birds, reptiles, amphibians | Otherwise cryptic but often concordant breaks distinguishing phylogeographic units in diverse Baja California species
Vertebrates, arthropods, plants | Little concordance in phylogeographic patterns, but strong similarities in postglacial colonization routes across Europe
Cats (Leopardus) | Remarkably concordant phylogeographic partitions in ocelot and margay cats across Central and South America
Bats and other mammals | Limited population structure of mtDNA lineages in Neotropical bats contrasts with strong structure in small non-volant mammals
Monkeys, toads | Concordant areas of endemism documented by molecular markers for macaques (Macaca) and toads (Bufo) on Sulawesi
Lemurs (primates) | Madagascar's landscape features that acted as phylogeographic barriers revealed by mtDNA sequences of several lemur species
Various plants | Strong concordance in major molecular phylogeographic clades across several plant species in the Pacific Northwest
Plants and animals | Congruent patterns of genetic diversity and biodiversity hotspots revealed for diverse biotas in the California Floristic Province





Note: In each case, multiple co-distributed species were surveyed, using molecular markers (typically mtDNA 
in animals, cpDNA in plants), for genealogical patterns on a regional scale. 


For example, in the maritime realm of the southeastern United States, 
biogeographers have long recognized the existence of two distinct faunal 
assemblages (temperate versus subtropical) that meet along the east-central 
Floridian coastline in the general area of Cape Canaveral (Briggs 1958, 1974). 
In other words, the southern range limit of many temperate species occurs 
in this transition zone, as does the northern range limit of many species 
adapted to warmer waters. For several other species that are continuously 
distributed across this transition zone, molecular data have revealed other- 
wise cryptic genealogical breaks in this same region (see Figures 6.4 and 
6.17). Thus, there is a general spatial agreement between major biogeographic
provinces, as defined by traditional faunal lists, and major phylogeographic
subdivisions often registered within species.
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Primary reference or review 


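Pelagic seabirds and marine turtles: Avise et al. 2000
Birds (Beringia): Zink et al. 1995
Birds (transcontinental ranges): Zink 1996
Birds (Caribbean islands): Bermingham et al. 1996
Birds (Cameroon): Smith et al. 2000
Mammals, birds, reptiles, amphibians: Riddle et al. 2000
Vertebrates, arthropods, plants: Taberlet et al. 1998
Cats (Leopardus): Eizirik et al. 1998
Bats and other mammals: Ditchfield 2000
Monkeys, toads: Evans et al. 2003
Lemurs (primates): Pastorini et al. 2003
Various plants: Soltis et al. 1997
Plants and animals: Calsbeek et al. 2003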





These findings, if they can be generalized, indicate that biogeographic 
provinces and the boundaries between them may often have been shaped 
by a combination of historical vicariance and contemporary selection. In the 
southeastern United States, Pleistocene or earlier events probably separated 
populations periodically into Gulf versus Atlantic zones, where adaptations 
to local conditions also arose. Populations in one or the other region then 
sometimes went extinct, in which case only their sister taxa in the other 
region remained present for observation today (thus contributing to the dis- 
tinctness of faunal provinces based on species lists). For species whose pop- 
ulations survive in both regions, genetic footprints of the sundering events 
are often retained in extant genomes. Today, these distinctive forms
characterize the Gulf and Atlantic regions, and they often meet and mix in 
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boundary zones, which therefore exist where they do in part for historical 
reasons and in part for reasons related to contemporary selection gradients 
and associated gene flow barriers. 

A similar spatial agreement (between traditionally recognized biogeo- 
graphic provinces and concentrations of intraspecific phylogroups) exists 
for freshwater fishes of the southeastern United States. Based on the known 
geographic ranges of 241 fish species in the area, Swift et al. (1985) estimat- 
ed and then clustered faunal similarity coefficients for the region's approxi- 
mately 30 major river drainages. The deepest split in this faunal similarity 
phenogram cleanly distinguished Gulf coast drainages from those along the 
Atlantic coast and in peninsular Florida, thereby defining two major biotic 
provinces that agree quite closely with intraspecific phylogeographic pat- 
terns often registered in molecular appraisals (Figure 6.18). Again, historical 
as well as contemporary factors must have operated to shape these concor- 
dant regional features of the biotic and genetic landscape. 
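The clustering step behind such a faunal phenogram can be sketched in a few lines. The example below is a minimal illustration only: it assumes the Jaccard coefficient as the similarity measure and average-linkage clustering, neither of which is confirmed here as the method of Swift et al. (1985), and the drainage names and species lists are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: shared species / species present in either drainage."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def deepest_split(faunas):
    """Average-linkage clustering of drainages, stopped at two clusters.

    `faunas` maps a drainage name to the set of species recorded there.
    The two clusters that remain correspond to the deepest split in the
    resulting phenogram.
    """
    clusters = [frozenset([name]) for name in faunas]

    def mean_similarity(c1, c2):
        pairs = [(x, y) for x in c1 for y in c2]
        return sum(jaccard(faunas[x], faunas[y]) for x, y in pairs) / len(pairs)

    while len(clusters) > 2:
        # Merge the pair of clusters with the highest mean similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: mean_similarity(*p))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)


# Hypothetical presence lists (not data from Swift et al. 1985).
faunas = {
    "gulf_1": {"sp_a", "sp_b", "sp_c"},
    "gulf_2": {"sp_a", "sp_b", "sp_d"},
    "atlantic_1": {"sp_e", "sp_f", "sp_g"},
    "atlantic_2": {"sp_e", "sp_f", "sp_c"},
}
groups = deepest_split(faunas)
print(groups)
```

With these invented lists, the two Gulf drainages group together and the two Atlantic drainages group together, even though one species is shared across the divide.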

Another example of aspect IV concerns European biotas. Extensive 
molecular appraisals of many animal and plant species across Europe have 
revealed both species-specific idiosyncrasies and generalized trends in phy- 
logeographic patterns (see reviews in Hewitt 1999, 2000; Petit et al. 2003b; 
Petit and Vendramin 2004; Taberlet et al. 1998). Among the latter, most
noticeable are genetic as well as traditional biogeographic data document- 
ing that a small number of Pleistocene refugia (notably in the Balkans, the 
Iberian Peninsula, and the Apennine or Italian Peninsula) were major cen- 
ters of historical genetic isolation and were the primary foci from which 
postglacial recolonizations of Europe often took place along specifiable and 
rather consistent routes. 


Genealogical discordance 


The flip side of genealogical concordance is the lack of phylogeographic 
agreement across various genetic characters or data sets. Genealogical dis- 
cordance likewise has four distinct aspects, and these also can be highly 
informative when uncovered in particular instances. 

Discordance aspect I occurs when different sequence characters within 
a gene suggest conflicting or overlapping clades in the gene tree. If the locus 
was historically free of inter-allelic recombination (as is normally true for 
mtDNA), then homoplasy (evolutionary "noise" involving convergences, 
parallelisms, or reversals in some character states) must be responsible. If 
the locus is housed in the nucleus, and if the nucleotide differences that dis- 
play overt phylogenetic discordance are clustered in distinct portions of a 
gene sequence, then inter-allelic recombination may have been responsible 
(see Chapter 4). Alleles that arose via intragenic recombination consist of 
amalgamated stretches of sequence that truly had different genealogical his- 
tories within the locus. 

Aspect II discordance involves disagreement across genes in phylogeo- 
graphic signatures within a given species. Some degree of phylogenetic dis- 
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Figure 6.18 Aspect IV genealogical concordance illustrated by fishes of the 
southeastern United States. Left, phenogram summarizing faunal relationships 
among 31 river drainages, based on faunal similarity coefficients from compilations 
of range data from 241 fish species (data from Swift et al. 1985). Right, geographic 
distributions of the two major branches in an intraspecific matrilineal genealogy 
for the spotted sunfish, Lepomis punctatus (data from Bermingham and Avise 1986). 
(After Avise 2000a.) 


cordance across gene trees is an inevitable consequence of Mendelian inher- 
itance and the vagaries of lineage sorting at unlinked loci through a sexual 
pedigree. As described earlier, however, additional biological factors can 
also produce genealogical heterogeneity among loci. Two such factors apply 
with special force to comparisons between nuclear and mtDNA (or cpDNA) 
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genealogies: the expected fourfold difference (all else being equal) in effec- 
tive population sizes for genomes housed in the nucleus versus the cyto- 
plasm; and behavioral differences between the genders, which can produce 
contrasting phylogeographic patterns at cytoplasmic versus nuclear loci if, 
for example, one sex or its gametes are far more dispersive than the other 
(see above). Finally, dramatic heterogeneity among gene trees within an 
extended pedigree can characterize any sets of molecular markers if some 
but not all of them have been under intense forms of balancing or diversi- 
fying selection. 
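The fourfold expectation follows from simple bookkeeping of gene copies, which the sketch below works through. It assumes an idealized population in which nuclear effective size follows Wright's standard sex-ratio formula and mtDNA is transmitted strictly through females; the function name and the census numbers are illustrative.

```python
def effective_gene_copies(n_females, n_males):
    """Return (nuclear, mitochondrial) effective numbers of gene copies.

    Nuclear autosomal loci are diploid and transmitted by both sexes:
    Ne = 4 * Nf * Nm / (Nf + Nm), giving 2 * Ne gene copies.
    mtDNA is haploid and transmitted by females only: Nf gene copies.
    """
    nuclear_ne = 4 * n_females * n_males / (n_females + n_males)
    return 2 * nuclear_ne, float(n_females)


# Equal sex ratio: the "all else being equal" case.
nuc_equal, mt_equal = effective_gene_copies(500, 500)
ratio_equal = nuc_equal / mt_equal

# Strongly female-biased breeding population: the expectation no longer holds.
nuc_skewed, mt_skewed = effective_gene_copies(900, 100)
ratio_skewed = nuc_skewed / mt_skewed

print(ratio_equal, ratio_skewed)
```

With equal numbers of breeding females and males the ratio is exactly four. The skewed case shows how departures from that assumption can shrink or even reverse the difference.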

Aspects III and IV of phylogeographic discordance simply suggest that 
whatever ecological or evolutionary forces may have been at play in a given 
instance have operated mostly in a species-idiosyncratic fashion, rather than 
in a generic way that might otherwise have concordantly shaped the genet- 
ic architectures of a regional biota. Even when each species proves to have a 
unique or peculiar phylogeographic history, this finding can be of consider- 
able interest, for example, in developing conservation plans for particular 
rare or endangered species (see Chapter 9). 


Microtemporal Phylogeny 


Most nucleic acids evolve far too slowly to permit direct detection of signifi- 
cant de novo sequence evolution over yearly or decade-long scales. One 
valiant attempt to describe such microtemporal changes involved comparisons
of mtDNA sequences in modern versus earlier populations of a kangaroo
rat (Dipodomys panamintinus) sampled at three locales in California
(Thomas et al. 1990). Sequences from extant specimens were compared with 
PCR-amplified sequences from dried museum skins prepared in 1911, 1917, 
and 1937. Results indicated temporal stability, with the three populations 
showing identical genetic relationships in the early- and late-twentieth-century
collections. However, even if a dramatic population genetic change had
been observed, it presumably would have entailed frequency shifts of preexisting
haplotypes (as can occur rapidly by random genetic drift, differential
reproduction, or migration of foreign lineages into the site), rather than the 
in situ origin and spread of de novo mutations over such a short period of 
time. A recent genetic survey of white-footed mice (Peromyscus leucopus) in 
the Chicago area compared mtDNA haplotypes of nineteenth-century muse- 
um skins with those present in modern samples, and it did indeed reveal 
such allele frequency changes at the population level (Pergams et al. 2003). 
Exceptional molecular systems do exist that mutate so rapidly as to per- 
mit de novo sequence evolution to be documented and monitored in con- 
temporary time (Jenkins et al. 2002). These systems are RNA viruses such as 
HIVs, the human immunodeficiency retroviruses responsible for AIDS 
(acquired immune deficiency syndrome). The mean rate of synonymous 
nucleotide substitution for HIV genomes is approximately 10^-2 per site per
year (W.-H. Li et al. 1988), or about a million times greater than typical rates 
in the nuclear genomes of most higher organisms (see Chapter 4). High 
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mutation rates in particular regions of RNA viruses, combined with occa- 
sional inter-strain recombination, underlie the astonishingly rapid changes 
often observed in viral pathogenicity and antigenicity (Coffin et al. 1986; 
Gallo 1987). 
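A rough calculation shows why a million-fold difference in rate makes de novo evolution observable in one system and not the other. The sketch below treats both rates as round-number assumptions (an HIV synonymous rate on the order of 10^-2 per site per year, and a nuclear rate a million-fold lower) and ignores multiple substitutions at the same site, so it is a first-order approximation only.

```python
def expected_pairwise_differences(rate_per_site_per_year, years, sites):
    """Expected differences between two lineages that split `years` ago.

    Both lineages accumulate substitutions, hence the factor of 2.
    Multiple hits at the same site are ignored.
    """
    return 2 * rate_per_site_per_year * years * sites


HIV_RATE = 1e-2       # assumed order of magnitude, per site per year
NUCLEAR_RATE = 1e-8   # assumed: one million-fold slower than HIV_RATE

# Two lineages separated for a decade, compared across 1,000 synonymous sites.
hiv = expected_pairwise_differences(HIV_RATE, years=10, sites=1000)
nuclear = expected_pairwise_differences(NUCLEAR_RATE, years=10, sites=1000)
print(hiv, nuclear)
```

Under these assumptions, a decade of separation yields on the order of hundreds of expected differences per thousand sites in the virus, but far less than one in a typical nuclear gene.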

The HIV viruses come in two distinct classes (HIV-1 and HIV-2) that 
emerged in Africa (Diamond 1992), probably within the past few decades 
following natural (or perhaps unnatural; see Marx et al. 2001; Poinar et al. 
2001a; Weiss 2001) transfers from related “simian” viruses (SIVs) that infect 
wild primates. Judging from recent phylogenetic reconstructions based on 
nucleotide sequences (Figure 6.19A), HIV-1 and HIV-2 originated when 
distinctive SIVs from chimpanzees and sooty mangabeys, respectively, 
jumped into humans, perhaps on more than one occasion each (Bushman 
2002; Gao et al. 1999; Hahn et al. 2000; Korber et al. 2000; Lemey et al. 2003; 
T. Zhu et al. 1998). 

Beginning in the 1980s, phylogenetic analyses of HIV sequences had 
already helped to document the histories of the viral lineages that spread 
the AIDS pandemic to millions of people worldwide (Desai et al. 1986; Gallo
1987; Yokoyama and Gojobori 1987). For example, sequences analyzed from 
15 HIV isolates from the United States, Haiti, and Zaire were the basis for 
the phylogenetic appraisal summarized in Figure 6.19B. Results provided 
some of the earliest genetic support for HIV’s African origins and the tim- 
ing of the virus’s subsequent expansion to Haiti and the United States. The 
most remarkable aspect of this viral phylogeny is its short time frame; vari- 
ous branching events date only to the 1960s and 1970s. Based on similar 
phylogenetic analyses, the whole HIV-1 pandemic traces to a common 
ancestral viral sequence from about 1931 (Korber et al. 2000). 
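One simple way to see how a date for a common ancestor can be extracted from serially sampled sequences is a root-to-tip regression: genetic distance from the root is regressed on sampling year, and the line is extrapolated back to zero divergence. The sketch below is a simplified stand-in, not the procedure used by Korber et al. (2000), and the sampling years and distances are synthetic values chosen for illustration.

```python
def root_date(sample_years, root_to_tip):
    """Least-squares fit of root-to-tip distance on sampling year.

    Returns (rate, root_year): the slope is the substitution rate per site
    per year, and root_year is where the fitted line crosses zero distance.
    """
    n = len(sample_years)
    mean_x = sum(sample_years) / n
    mean_y = sum(root_to_tip) / n
    sxx = sum((x - mean_x) ** 2 for x in sample_years)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sample_years, root_to_tip))
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    return rate, -intercept / rate


# Synthetic example: isolates sampled in three years, with hypothetical
# root-to-tip distances in substitutions per site.
years = [1985, 1995, 2005]
distances = [0.07, 0.09, 0.11]
rate, root_year = root_date(years, distances)
print(rate, root_year)
```

Real data scatter around the fitted line, so any such estimate carries a confidence interval that this sketch does not compute.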

Historical reconstructions based on molecular data have likewise been 
accomplished for other disease-causing RNA viruses, a case in point being 
the dengue virus (Flavivirus sp.). Dengue is an emerging tropical disease 
now affecting more than 50 million people. A phylogeny based on 
nucleotide sequences indicates that the virus arose approximately 1,000 
years ago, that its transfer from monkeys to humans led to sustained human 
transmission beginning about one to three centuries ago, and that current 
global diversity in the virus involves four or five primary lineages (Twiddy 
et al. 2003). 

Another consequence of rapid sequence evolution in RNA viruses is 
that different people (and sometimes even the same individual at different 
times; Holmes et al. 1992) may carry recognizable variants of the virus, a 
finding with forensic ramifications. For example, in comparisons of HIV-1 
sequences from a Florida dentist, seven of his infected patients, and 35 other 
local HIV-1 carriers, it was shown that the dentist’s particular viral strain 
was genealogically allied to those of five of his clients (Ou et al. 1992). These 
molecular findings were interpreted to provide the first genetic confirma- 
tion of HIV transmission (unintentional) from an infected health care pro- 
fessional to clients. Another such case of HIV transmission, documented by 
molecular markers, involved criminal intent (Metzker et al. 2002). 
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Figure 6.19 Molecular microphylogenies for HIV isolates. (A) Phylogeny based
on protein sequences from the Pol gene in various HIV strains and in SIVs from 
several primate species. Note the close relationship of HIV-1 to SIVcpz (from chim- 
panzees, Pan troglodytes) and of HIV-2 to SIVsm (from sooty mangabeys, Cercocebus 
atys). Other primates whose SIVs were analyzed include Cercopithecus lhoesti 
(lhoest), C. solatus (sun), C. albogularis (syk), Mandrillus sphinx (mnd), and
Chlorocebus sp. (agm). (After Hahn et al. 2000.) (B) Phylogeny, based on several 
sequenced regions of HIV genomes, indicating one early route in the spread of 
AIDS from Africa to the New World. (After W.-H. Li et al. 1988.) 


SUMMARY 


1. All conspecific individuals are genetically related through a time-extended 
pedigree (mating partners and parent-offspring links) that constitutes the full 
intraspecific genealogy of a species. Molecular markers can help to recover 
various components of this extended pedigree. 


2. Molecular approaches for assessment of kinship within a species normally
require highly polymorphic, qualitative genetic markers with known transmis- 
sion patterns. However, the complexity of potential transmission pathways 
between relatives more distant than parents and offspring, coupled with the 
relatively narrow range of potential kinship coefficients (0.00-0.25), means that 
distinguishing precise categories of genetic relationship for specific pairs of 
individuals can be accomplished only in rather special cases. On the other 
hand, assessments of mean genetic relatedness within groups are readily con- 
ducted. 


3. In eusocial species, such as many haplodiploid hymenopterans, estimates of 
mean intra-colony genetic relatedness sometimes have proved to be high, but 
numerous exceptions present a conundrum for some sociobiological theories 
on the evolution of reproductive altruism. 


4. Within non-eusocial groups, a variety of mean genetic relatedness values have 
been observed using molecular markers, and these values are often inter- 
pretable in terms of the suspected behaviors and natural histories of the partic- 
ular species assayed. Genetic markers have also helped to address questions 
regarding mechanisms and genetic consequences of kin recognition. 


5. Populations of almost all species are genetically structured across geography. 
These genetic architectures have been characterized for numerous species 
using molecular markers, and they clearly can be influenced by ecological and 
evolutionary factors operating over a wide variety of spatial and temporal 
scales. Among these influences are mating systems and gene flow regimes, 
which in turn can be influenced by the species-specific dispersal capabilities of 
gametes and zygotes, by the behaviors of organisms (including their vagility 
and social cohesiveness) and by the physical structure of the environment. 


6. The contemporary genetic architecture of any species will also have been influ- 
enced by biogeographic and demographic factors of the past. In large measure, 
historical perspectives on population genetic structure were stimulated by the 
extended genealogical reconstructions made possible by molecular assays for 
non-recombining haplotypes in cytoplasmic genomes (especially animal 
mtDNA). A relatively new discipline termed phylogeography has enriched 
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biogeographic analyses and provided an empirical and conceptual bridge 
between the formerly independent fields of traditional population genetics 
and phylogenetic biology. 


7. Some viruses evolve so rapidly that genetic changes can be directly observed
across time frames of years or decades. For example, molecular phylogenetic 
appraisals have revealed important details about the origin and spread of HIV 
viruses within the last century. 
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Speciation and Hybridization 





Without gene flow, it is inevitable that there will be speciation. 
M. H. Wolpoff (1989) 


Numerous "species concepts" have been advanced for sexually reproducing organ- 
isms (Mayden 1997), the most historically influential of which are summarized in 
Box 7.1. Most of these concepts entail the perception of conspecific populations as a 
field for gene recombination— that is, as an extended reproductive community with- 
in which genetic exchange potentially takes place. For example, under the popular 
biological species concept (BSC) championed by Dobzhansky (1937), species are en- 
visioned as "groups of actually or potentially interbreeding natural populations 
which are reproductively isolated from other such groups" (Mayr 1963). Many au- 
thors have expressed sentiments on the BSC similar to those of Ayala (1976b): 


Among cladogenetic processes, the most decisive one is speciation—the process by 
which one species splits into two or more. ... Species are, therefore, independent evolutionary
units. Adaptive changes occurring in an individual or population may be extended
to all members of the species by natural selection; they cannot, however, be
passed on to different species. 


Thus, under the BSC and related concepts, species are perceived as biological and evo- 
lutionary entities that are more meaningful and perhaps less arbitrary than other taxo- 
nomic categories such as subspecies, genera, or orders (Dobzhansky 1970; Howard 
and Berlocher 1998). Nonetheless, several complications attend the application of bio- 
logical (or other) species concepts. 

One difficulty of the BSC involves the discretionary judgments that are often re- 
quired about the specific status of closely related forms in allopatry (and also of extant 
forms to their evolutionary ancestors). Inevitably, reproductive isolating barriers, or 
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BOX 7.1 Representative Species Concepts and Definitions 





1. Biological species concept (BSC) (Dobzhansky 1937): "Species are systems of pop- 
ulations: the gene exchange between these systems is limited or prevented by 
a reproductive isolating mechanism or perhaps by a combination of several 
such mechanisms." 
Comment: Unquestionably the most influential concept for sexually reproduc- 
ing species, the BSC remains popular today. 


2. Evolutionary species concept (ESC) (Simpson 1951): "a lineage (ancestral-descen- 
dant sequence of populations) evolving separately from others and with its 
own unitary evolutionary role and tendencies." 

Comment: Applicable both to living and extinct groups, and to sexual and asex- 
ual organisms. However, this concept is vague operationally in what is meant 
by "unitary evolutionary role and tendencies." 


3. Phylogenetic species concept (PSC) (Cracraft 1983): a monophyletic group com- 

posed of "the smallest diagnosable cluster of individual organisms within 
which there is a parental pattern of ancestry and descent." 
Comment: Explicitly avoids all reference to reproductive isolation and focuses 
instead on phylogenetic histories of populations. A serious problem involves 
how monophyly is to be recognized and how to distinguish histories of traits 
(e.g., gene trees) from histories of organisms (pedigrees). 


4. Recognition species concept (RSC) (Paterson 1985): the most inclusive population 
of biparental organisms which share a common fertilization system. 
Comment: Similar to the BSC in viewing conspecific populations as a field for 
gene recombination. However, this concept shifts the focus away from isolating 
mechanisms as barriers to gene exchange between species and toward the 


"RIBs" (Box 7.2), develop between geographically separated populations as an 
ancillary by-product of genomic divergence, but the time frames involved and 
the magnitudes of differentiation are matters for study in particular instances. 
The “acid test” for biological species status—whether populations retain sepa- 
rate identities in sympatry—often has not been carried out in nature. A second 
practical difficulty involves the issue of how much genetic exchange disqualifies 
populations from status as separate biological species. Thus, the study of specia- 
tion conceptually links the topic of gene flow (see Chapter 6) with that of intro- 
gressive hybridization. Under the BSC, there are no black-and-white solutions to 
either of these difficulties because genetic divergence and speciation are gradual 
processes that in many cases can yield gray outcomes at particular points in evo- 
lutionary time, and because levels of genetic exchange can vary along a continu- 
um from nil to extensive (Dobzhansky 1976). 
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positive role of reproduction-facilitating mechanisms among members of a 
species. Although reproductive barriers can arise as a by-product of speciation, 
under the RSC they are not viewed as an active part of the speciation process. 


5. Cohesion species concept (CSC) (Templeton 1989): “the most inclusive population 
of individuals having the potential for cohesion through intrinsic cohesion 
mechanisms.” 

Comment: Attempts to incorporate strengths of the BSC, ESC, and RSC and 
avoid their weaknesses. The major classes of cohesion mechanisms are genetic 
exchangeability (factors that define the limits of spread of new genetic variants 
through gene flow) and demographic exchangeability (factors that define the 
fundamental niche and the limits of spread of new genetic variants through 
genetic drift and natural selection). 


6. Concordance principles (CP) (Avise and Ball 1990): a suggested means of recog- 
nizing species by the evidence of concordant phylogenetic partitions at multi- 
ple independent genetic attributes. 

Comment: Attempts to incorporate strengths of the BSC and PSC and avoid 
their weaknesses. This approach accepts the basic premise of the BSC, with the 
understanding that the reproductive barriers are to be interpreted as intrinsic 
as opposed to extrinsic (purely geographic) factors. When phylogenetic concordance
is exhibited across genetic characters solely because of extrinsic barriers
to reproduction, subspecies status is suggested.


These and other species concepts all have limitations related to the inevitable am- 
biguities of cleanly demarcating continuously evolving lineages (Hey 2001; Hey 
et al. 2003). Thus, for taxonomy, conservation, and other purposes, it may be wisest
to accept (rather than bemoan) such uncertainty as simply being inherent in the
nature of evolutionary processes. 


Another challenge in applying the BSC involves a need to distinguish the 
evolutionary origins of RIBs from their genetic consequences. Normally, repro- 
ductive barriers under the BSC are considered intrinsic biological factors rather 
than purely extrinsic limits to reproduction resulting from geographic separation 
alone. However, this distinction blurs when syntopic populations (those occu- 
pying the same general habitat) are isolated via preferences for different micro- 
habitats, particularly when these ecological proclivities are coupled with differ- 
ences in mate choice (Diehl and Bush 1989). In such situations, one substantive 
as well as semantic issue is whether speciation may have occurred sympatrical- 
ly, versus allopatrically followed by secondary range overlap. Another issue is 
whether certain types of RIBs arise in direct response to selection pressures fa- 
voring homotypic matings (see Box 7.2) or whether they reflect non-selected by- 
products of genomic differentiation that occurred for other reasons. 
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BOX 7.2 Classification of Reproductive Isolating 
Barriers (RIBs) 





1. Prezygotic Barriers 


a. Ecological or habitat isolation: Populations occupy different habitats in the 
same general region, and most matings take place within these microhabitat
types.

b. Temporal isolation: Matings take place at different times (e.g., seasonally 
or diurnally). 

c. Ethological isolation: Individuals from different populations meet, but do 
not mate. 


d. Mechanical isolation: Inter-population matings occur, but no transfer of male 
gametes takes place. 


e. Gametic mortality or incompatibility: Transfer of male gametes occurs, but 
eggs are not fertilized. 


2. Postzygotic Barriers 
a. F1 inviability: F1 hybrids have reduced viability.
b. F1 sterility: F1 hybrids have reduced fertility.
c. Hybrid breakdown: F2, backcross, or later-generation hybrids have reduced
viability or fertility.


One rationale for distinguishing between prezygotic and postzygotic RIBs is that, 
in principle, only the former are directly selectable. Under the "reinforcement" 
scenario of Dobzhansky (1940; Blair 1955), natural selection can act to superim- 
pose prezygotic RIBs over preexisting postzygotic RIBs that may have arisen, for 
example, in former allopatry (Liou and Price 1994). As stated by Dobzhansky 
(1951), “Assume that incipient species, A and B, are in contact in a certain territo- 
ry. Mutations arise in either or both species which make their carriers less likely 
to mate with the other species. The nonmutant individuals of A which cross to B 
will produce a progeny which is adaptively inferior to the pure species. Since the 
mutants breed only or mostly within the species, their progeny will be adaptively 
superior to that of the non-mutants. Consequently, natural selection will favour 
the spread and establishment of the mutant condition.” 

Notwithstanding its conceptual appeal, Dobzhansky's suggestion has
proved difficult to verify observationally or experimentally (see review in Butlin 
1989). Koopman (1950) and Thoday and Gibson (1962) provide widely quoted ex- 
amples of selective reinforcement of prezygotic RIBs, but other such experimental 
studies have produced equivocal outcomes (e.g., Spiess and Wilke 1984). It is true 
that "reproductive character displacement" (greater interspecific mate discrimi- 
nation in sympatry than in allopatry) is quite common in nature (Noor 1999). 
However, the extent to which it reflects direct selection against hybrids as op- 
posed to other evolutionary mechanisms (such as spatially varying intensities of 
sexual selection without hybrid dysfunction) remains uncertain (Day 2000; Turel- 
li et al. 2001). 


Associated with the speciation process, under any definition, is the conversion
of genetic variability within a species to between-species genetic differences.
However, because RIBs retain primacy in demarcating species under the
BSC, no arbitrary magnitude of molecular genetic divergence can provide an 
infallible metric to establish specific status, especially among allopatric forms. 
Furthermore, as noted by Patton and Smith (1989), almost "all mechanisms of 
speciation that are currently advocated by evolutionary biologists ... will result 
in paraphyletic taxa as long as reproductive isolation forms the basis for 
species definition" (see below). How, then, can molecular markers inform spe- 
ciation studies? First, molecular patterns provide distinctive genetic signatures 
(Figure 7.1) that often can be related to demographic events during speciation 
or to the geographic settings in which speciation took place (Barraclough and 
Vogler 2000; Harrison 1998; Neigel and Avise 1986; Templeton 1980a). Second, 
estimates of genetic differentiation between populations at various stages of 
RIB acquisition are useful in assessing temporal aspects of the speciation 
process (Coyne 1992). Finally, molecular markers are invaluable for assessing 
the magnitude and pattern of genetic exchange among related forms, and 
thereby can help to elucidate the intensity and nature of RIBs. 


The Speciation Process 


What follows are examples of the diversity of questions in speciation theory 
that have been empirically addressed and answered, at least partially, through 
use of molecular genetic markers. 


How much genetic change accompanies speciation? 


TRADITIONAL PERSPECTIVES. One long-standing view of species differences, 
stated clearly by Morgan early in the last century (1919), is that species differ 
from each other “not by a single Mendelian difference, but by a number of 
small differences.” A counterproposal frequently expressed was that new
species, or even genera, might arise by single mutations of a special kind,
"macromutations" or "systemic mutations" (deVries 1910; Goldschmidt 1940),
that suddenly transform one kind of organism into another. Although
such suggestions for saltational speciation are untenable in their original for- 
mulation, more recent theories have stressed plausible routes by which species 
can arise rapidly, perhaps in some cases with minimal molecular genetic diver- 
gence overall. Examples of such avenues to sudden speciation in plants and 
animals are summarized in Box 7.3. But apart from these "special cases" 
(which nonetheless may be fairly common), can species arise quickly and with 
little genetic alteration? 

One class of arguments for sudden speciation came from paleontology. 
Based on a reinterpretation of "gaps" in the fossil record, Eldredge and Gould 
(1972) proposed that diagrams of the Tree of Life, in which divergence is plot- 
ted on one axis and time on the other, are best represented as "rectangular" 

Figure 7.1 Five modes of speciation and corresponding gene trees. Shown on the
right are distributions of allelic lineages in the two daughter species (solid versus gray
lines). For simplicity, each population is represented as monomorphic, and the gene
genealogy in each case is {[(a,b)(c)][(d)(e,f)]}. In reality, most populations are likely to be
polymorphic and hence, upon separation, are expected to evolve through intermediate 
states of polyphyly and paraphyly in the gene tree (see Figure 4.13). (A) Speciation by 
geographic subdivision, with the physical partition congruent with an existing phylo- 
genetic discontinuity. (B) Speciation by subdivision, with the partition not congruent 
with an existing phylogenetic discontinuity. (C) Speciation in a peripheral population. 
(D) Speciation via colonization of a new habitat by propagule(s) from a single source 
population. (E) Local sympatric speciation. (After Harrison 1991.) 
BOX 7.3 Sudden Speciation


Several known pathways to rapid speciation entail little or no change in genetic 
composition at the allelic level (beyond the rearrangement or sorting of genetic vari- 
ation from the ancestral forms). 


1. Polyploidization. The origin of stable polyploids usually is associated with
hybridization between populations or species that differ in chromosomal
constitution. If the hybrid is sterile only because its parental chromosomes are too
dissimilar to pair properly during meiosis, this difficulty is removed by the
doubling of chromosomes that produces a polyploid hybrid. Furthermore, such
a polyploid species spontaneously exhibits reproductive isolation from its
progenitors because any cross with the parental species produces progeny with
unbalanced (odd-numbered) chromosome sets. For example, a cross between a
tetraploid and a diploid progenitor produces mostly sterile triploids.

Examples: The treefrog Hyla versicolor is a tetraploid that, on the basis of
allozyme and immunological comparisons, is believed to have arisen recently
from hybridization between distinct eastern and western populations of its
cryptic diploid relative H. chrysoscelis (Maxson et al. 1977; Ralin 1976).
Polyploidy is relatively uncommon in animals, however, and is confined primarily
to forms that reproduce asexually (see the section on hybridization in this
chapter). On the other hand, at least 70%-80% of angiosperm plant species may be
of demonstrably recent polyploid origin (Lewis 1980), and most plants have
probably had at least one polyploidization event somewhere in their
evolutionary history.


Especially for recent polyploids, molecular genetic data often provide
definitive evidence identifying the ancestral parental species. For example, allozyme
analyses established that two tetraploid goatsbeards (Tragopogon mirus and
T. miscellus) arose from recent crosses between the diploids T. dubius and T. porrifolius
and the diploids T. dubius and T. pratensis, respectively (Roose and Gottlieb 1976).
These allopolyploids (polyploids arising from combinations of genetically distinct
chromosome sets) additively expressed all examined protein electrophoretic alleles
inherited from their progenitors. Similar molecular analyses involving allozymes
or cpDNA have demonstrated that some polyploid forms are of polyphyletic (i.e.,
multi-hybridization in this case) origin, including Asplenium ferns (Werth et al.
1985), Glycine soybeans (Doyle et al. 1990), Heuchera alumroots (Soltis et al. 1989),
Plagiomnium bryophytes (Wyatt et al. 1988), and Senecio composites (Ashton and
Abbott 1992). An especially ingenious application of genetic markers documented
the complex cytological pathway leading to a new species of tetraploid fern,
Asplenium plenum. Using allozyme markers, Gastony (1986) showed that A. plenum
must have arisen through a cross between a triploid A. curtissii (which had
produced an unreduced spore) and a diploid A. abscissum (which had produced a
normal haploid spore). The nearly sterile triploid A. curtissii itself was shown to have
arisen through a cross between a tetraploid species, A. verecundum, and diploid
A. abscissum. New autopolyploid taxa (polyploids that arise by a multiplication of one
basic set of chromosomes) also have been described through molecular assays
(Rieseberg and Doyle 1989).

In the modern era of massive DNA sequencing, genomic dissections of
polyploidization phenomena have become highly sophisticated. For example, by
computer-searching the entire Arabidopsis genomic sequence, Bowers et al. (2003) not
only identified numerous specific chromosomal segments that had been duplicat- 
ed by a probable polyploidization event after Arabidopsis diverged from most di- 
cots, but also analyzed genomic patterns of subsequent loss or “diploidization” 
(restoration of the diploid condition) for many of these chromosomal regions. 


2. Chromosomal rearrangements. Closely related taxa differing in a variety of
structural chromosomal features (including translocations, inversions, or
chromosome number) may exhibit reproductive isolation for at least three reasons.
First, some structural differences themselves may cause difficulties in
chromosome pairing and proper disjunction during meiosis in hybrids, resulting in
partial or complete sterility. Second, some gene rearrangements may diminish
fitness in hybrids through disruptions of gene expression patterns resulting
from position effects. For either of these reasons, formation of a new species via
major chromosomal rearrangements probably entails a relatively quick
transition through an “underdominant” (fitness-diminished) heterozygous phase to a
condition of population homozygosity for the new karyotype (Spirito 1998).
Third, structural rearrangements in specific chromosomal regions have the
effect of reducing recombination (even if not fitness) when in heterozygous
condition, and this reduction per se can also act as a partial barrier to gene
exchange among forms that differ karyotypically (Navarro and Barton
2003a).

Examples: White (1978a) compiled evidence that chromosomal rearrangements
often are involved in the speciation process for animals (see also Sites and
Moritz 1987), as did Grant (1981) for plants. When chromosomal rearrangements
have conferred reproductive isolation recently, allelic differentiation
between the descendant species may still be minimal. Some examples in which
the reported magnitude of allozyme divergence between chromosomally
differentiated forms is about the same as that between populations within a species
involved subterranean Thomomys and Spalax rodents (Nevo and Shaw 1972;
Nevo et al. 1974; Patton and Smith 1981) and Sceloporus lizards (Sites and
Greenbaum 1983). In some of these cases, however, the chromosomal
differences are not complete barriers to reproduction.


3. Changes in the mating system. Many plant species exhibit self-incompatibility, 
whereby pollen fail to fertilize ova from the same individual. The mechanisms 
may involve alleles at a self-incompatibility locus that is known to be highly 
polymorphic within some species (Ioerger et al. 1990) or a physical barrier, such 
as a difference in the lengths of styles and stamens (heterostyly), that inhibits 
self-pollination. A switch in mating system, for example, from
self-incompatibility to self-compatibility (autogamy) as mediated by a change from
heterostyly to homostyly can precipitate a rapid speciation event with little change in
overall genic composition. Other alterations of the breeding system, such as the 
timing of reproduction, similarly can generate reproductive isolation rapidly. 


Examples: In many plant groups, closely related taxa exhibiting contrasting
reproductive modes suggest that “the evolution of floral syndromes, and their
influence on mating patterns, is intimately associated with the development of
reproductive isolation and speciation” (Barrett 1989). For example, self-compatible
Stephanomeria malheurensis apparently arose from a self-incompatible
progenitor, S. exigua coronaria, and also differs from it by chromosomal
rearrangements that are the principal cause of hybrid sterility (Stebbins 1989).
High allozyme similarities (Gottlieb 1973b) suggest that the process took place
recently, such that the derivative species was extracted from the repertoire of
genetic polymorphisms present in the ancestor (Gottlieb 1981). Such evolution
of self-fertilization probably favors the establishment of chromosomal
rearrangements that contribute to reproductive isolation of the selfing derivative
(Barrett 1989).


branching patterns (Stanley 1975) reflecting evolution through “punctuated 
equilibria.” According to this view, a new species arises rapidly and, once 
formed, represents a well-buffered homeostatic system, resistant to within-lin- 
eage change (anagenesis) until speciation is triggered again, perhaps by an al- 
teration in ontogenetic (developmental) pattern (Gould 1977). A second class of 
arguments for sudden speciation came from molecular genetics: The molecular 
events responsible might involve changes in gene regulation, perhaps mediat- 
ed by relatively few control elements that could have a highly disproportionate 
influence on organismal evolution, including the erection of RIBs (Britten and 
Davidson 1969, 1971; Krieber and Rose 1986; McDonald 1989, 1990; Rose and 
Doolittle 1983; Wilson 1976). A third class of arguments involved demographic 
and population genetic considerations. Mayr (1954) suggested that “founder 
effects” in small geographically isolated populations might produce “genetic 
revolutions” leading to new species. Carson (1968) advanced a "founder-flush" 
model in which rapid population expansion and relaxed selection following a 
severe founder event facilitated the appearance and survival of novel recombi- 
nant genotypes leading to a new species (see also Slatkin 1996). Templeton 
(1980b) introduced a "transilience model" in which speciation involves a fast 
shift to a new adaptive peak under conditions in which founder events cause 
rapid but temporary inbreeding without severely depleting genetic variability. 
Carson and Templeton (1984) compared these models, and Provine (1989) dis- 
cussed their histories. 

The generality of such quick-speciation scenarios proved difficult to docu- 
ment, however. This is in part because speciation is a highly variable and eclec- 
tic process, probably differing greatly in mean tempo and mode in different 
kinds of organisms (such as mobile marine fishes versus sedentary insects). 
Furthermore, even exceptionally rapid speciation is normally an extended tem- 
poral process, seldom directly observable from start to finish during a human 
lifetime (see below). Finally, many of the genetic and demographic events pro- 
posed to be associated with speciation often occur at the population level as 
well without producing new species (e.g., Rundle et al. 1998). 




On the other hand, many authors viewed speciation as a rather unexcep- 
tional continuation of the same microevolutionary processes that generate geo- 
graphic population structure, albeit with the added factor of RIB acquisition 
(see early reviews in Barton and Charlesworth 1984; Charlesworth et al. 1982). 
This view was termed “phyletic gradualism” by Eldredge and Gould (1972).
However, Sewall Wright (1931) and some others who interpreted speciation 
mostly as a continuation of microevolution (Provine 1986) nonetheless empha- 
sized that episodic shifts in evolution could result from genetic drift (in con- 
junction with selection) facilitating rapid leaps across "fitness peaks" in "adap- 
tive landscapes." Paleontologists likewise were long aware that evolutionary 
changes (at least in morphology) often occur in fits and starts, rather than at a 
steady pace (Simpson 1944). Thus, the crucial distinction is not whether evolu- 
tionary change is gradual or episodic, but whether or not speciation as a 
process is somehow uncoupled from processes of intraspecific population dif- 
ferentiation (as Gould proposed in 1980). In earlier approaches to addressing 
these issues, many nonmolecular assessments of the magnitude and pattern of 
genetic differentiation associated with species formation were made. 

These traditional approaches often involved the study of phenotypes in 
later-generation crosses between related species that could be hybridized. One 
method was to measure the variance among F2 hybrids for particular behavioral
or morphological characters. Frequently, such variances proved to greatly
exceed those in either the parental or the F1 populations, and few F2 hybrids
fell into the parental classes (DeWinter 1992; Lamb and Avise 1987; Rick and 
Smith 1953). Such results appeared attributable to recombination-derived vari- 
ation, and they indicated that for the assayed characters, the parental species 
must differ in multiple genes, each with small effect (although only the mini- 
mum number of such polygenes can be estimated by this approach; Lande 
1981). Another traditional method of assessing genetic differences between 
species involved chromosomal mapping of prezygotic or postzygotic RIB 
genes through searches for consistent patterns of co-segregation in experimental
backcross progeny (see reviews in Charlesworth et al. 1987; Ritchie and
Phillips 1998; Wu and Hollocher 1998). For example, in sibling species of
Drosophila, partial hybrid sterility and inviability proved attributable to
differences at several (mostly anonymous) loci on each chromosome (Dobzhansky
1970, 1974; Orr 1987, 1992; Wu and Davis 1993), with X-linked genes typically
having the greatest effects (Coyne and Orr 1989a). Unfortunately, such studies
could only be conducted on model experimental species with well-known ge- 
netic systems. 
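
The variance-based reasoning in this paragraph can be illustrated with a minimal sketch. It assumes the classical Castle-Wright estimator, which Lande (1981) extended; the formula is not given in the text above, and the trait values below are hypothetical. The estimator divides the squared difference between the parental means by eight times the segregation variance, approximated here as the F2 variance minus the F1 variance. The result is a minimum number of effective factors, not a true gene count.

```python
def castle_wright_min_genes(mean_p1, mean_p2, var_f1, var_f2):
    """Minimum number of effective factors distinguishing two parental lines.

    Assumption (not from the text): Castle-Wright form,
        n = (mean_p1 - mean_p2)^2 / (8 * (var_f2 - var_f1)),
    where var_f2 - var_f1 approximates the segregation variance.
    """
    segregation_variance = var_f2 - var_f1
    if segregation_variance <= 0:
        raise ValueError("F2 variance must exceed F1 variance")
    return (mean_p1 - mean_p2) ** 2 / (8.0 * segregation_variance)


# Hypothetical trait measurements (arbitrary units)
n_min = castle_wright_min_genes(mean_p1=10.0, mean_p2=18.0, var_f1=1.0, var_f2=3.0)
print(f"Minimum number of effective factors: {n_min:.1f}")
```

Because linkage, unequal effects, and dominance all bias the estimate downward, the output is best read as a lower bound, consistent with the caveat attributed to Lande (1981) above.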

There were at least two other serious limitations to these classic Mendelian 
approaches. First, they could be applied only to hybridizable taxa. Second, pat- 
terns of allelic assortment could be inferred only for loci distinguishing the 
parental species; genes that were identical in the parents escaped detection. But 
to determine the proportion of genes distinguishing species, both divergent and 
non-divergent loci must be monitored. Thus, following the introduction of
allozyme methods in the mid-1960s, many researchers revisited the issue of genetic
differentiation during speciation, under the rationale that these molecular
assays permitted, for the first time, examination of a large sample of gene products
presumably unbiased with regard to magnitude of divergence. Early reviews of 
this effort were provided by Ayala (1975), Avise (1976), and Gottlieb (1977). 


CLASSICAL MOLECULAR EVIDENCE. A classic survey of allozyme differentiation 
accompanying RIB acquisition involved the Drosophila willistoni complex (Ay- 
ala et al. 1975b), which includes populations at several stages of the speciation 
process, as gauged by reproductive relationships and geographic distributions. 
This complex, which is distributed widely in northern South America, Central 
America, and the Caribbean, provides a paradigm of gradual speciation in- 
volving geographic populations that are fully compatible reproductively; dif- 
ferent “subspecies” that are allopatric and exhibit incipient reproductive isola- 
tion in the form of postzygotic RIBs (hybrid male sterility, in this case, in 
laboratory crosses); “semispecies” that overlap in geographic distribution and 
show both postzygotic RIBs and prezygotic RIBs (homotypic mating prefer- 
ences), the latter presumably having evolved under the influence of natural se- 
lection after sympatry was secondarily achieved between subspecies (see Box 
7.2); sibling species that show complete reproductive isolation but remain near- 
ly identical morphologically; and non-sibling species that are phenotypically 
distinct and presumably diverged at earlier times. Frequency distributions of 
genetic similarities across 36 allozyme loci are summarized in Figure 7.2. Be- 
tween subspecies (or semispecies), nearly 15% of these genes showed substan- 
tial or fixed allele frequency differences involving detected replacement substi- 
tutions. Results were interpreted to indicate that a substantial degree of genetic 
differentiation occurs during the first stage of speciation (Ayala et al. 1975b). 
Subsequent analyses of more pairs of closely related Drosophila taxa were re- 
viewed by Coyne and Orr (1989b, 1997). These studies examined allozyme di- 
vergence in a cross section of populations at various stages of the speciation 
process, as defined by geographic distributions and experimentally deter- 
mined levels of prezygotic and postzygotic reproductive isolation. Results 
demonstrated that even partial reproductive isolation is often associated with 
large genetic distances (Table 7.1). 
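
As a small illustration of how the prezygotic index in Table 7.1 is scaled, the sketch below applies the definition given in the table footnote: one minus the ratio of heterotypic to homotypic mating frequencies. The function name and the mating frequencies are hypothetical and serve only to show the arithmetic.

```python
def prezygotic_isolation_index(freq_heterotypic, freq_homotypic):
    """Index = 1 - (frequency heterotypic matings / frequency homotypic matings)."""
    if freq_homotypic <= 0:
        raise ValueError("homotypic mating frequency must be positive")
    return 1.0 - freq_heterotypic / freq_homotypic


# Hypothetical mate-choice trials: 20% of heterotypic pairings mate,
# versus 80% of homotypic pairings.
index = prezygotic_isolation_index(freq_heterotypic=0.2, freq_homotypic=0.8)
print(f"Prezygotic isolation index: {index:.2f}")
```

An index of zero indicates random mating with respect to type, and an index of one indicates that no heterotypic matings occur.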

Other noteworthy early studies demonstrating moderate to large allozyme 
distances between populations at various stages of speciation involved Lepomis 
sunfishes (Figure 7.2), Peromyscus mice (Zimmerman et al. 1978), and He- 
lianthus sunflowers (Wain 1983). In a sense, these and the Drosophila studies 
merely affirm what was emphasized in Chapter 6; that is, that considerable ge- 
netic differentiation among geographic populations can accumulate prior to 
the completion of intrinsic reproductive isolation. 
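
The allozyme distances cited in this section are values of Nei's (1972) standard genetic distance, D. The text does not spell out the calculation, so the following is a minimal sketch under the usual definition: genetic identity I is the summed cross-products of allele frequencies divided by the geometric mean of the summed within-population homozygosities, and D = -ln(I). The data layout (one dictionary of allele frequencies per locus) and the example frequencies are assumptions for illustration only.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's (1972) standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele name to
    frequency. Both lists must cover the same loci in the same order.
    """
    if len(pop_x) != len(pop_y) or not pop_x:
        raise ValueError("populations must share a non-empty set of loci")
    jx = jy = jxy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        alleles = set(freqs_x) | set(freqs_y)
        jx += sum(freqs_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(freqs_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)
    identity = jxy / math.sqrt(jx * jy)
    if identity == 0.0:
        return math.inf  # no alleles shared at any locus
    return -math.log(identity)


# Hypothetical two-locus example: the populations share a fixed allele at
# locus 1 and are fixed for different alleles at locus 2.
pop_a = [{"fast": 1.0}, {"fast": 1.0}]
pop_b = [{"fast": 1.0}, {"slow": 1.0}]
print(f"Nei's D = {nei_distance(pop_a, pop_b):.3f}")
```

Averaging over loci is omitted because the number of loci cancels in the ratio. Monomorphic loci that are identical in both populations pull D toward zero, which is why a broad sample of loci, divergent and non-divergent alike, matters for such estimates.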

Among the vertebrates, perhaps the record for magnitude of genetic differ- 
entiation among forms that had been considered conspecific involves the sala- 
mander Ensatina eschscholtzii. This complex of morphologically differentiated 
populations encircles the Central Valley of California in ringlike fashion, with 
adjacent populations usually capable of genetic exchange (Jackman and Wake 
1994; Wake and Yanev 1986; Wake et al. 1986). Remarkably, various popula- 
tions in this “ring species” show allozyme distances up to D = 0.77 (values 
Figure 7.2 Distributions of allozyme loci with respect to genetic similarity (Nei's
1972 measure) in some of the first multi-locus comparisons among populations at vari- 
ous stages of evolutionary divergence. Shown are results from the Drosophila willistoni 
complex of fruit flies (data from Ayala et al. 1975b) and Lepomis sunfishes (data from 
Avise and Smith 1977). 
TABLE 7.1 Means and standard errors of genetic distance (Nei's D, allozymes)
characterizing Drosophila taxa at indicated levels of prezygotic and
postzygotic reproductive isolation

Reproductive       Number of      Mean genetic distance (SE)
isolation index    comparisons    Prezygotic (a)    Postzygotic (b)
0.00               13             0.122 (0.046)     0.138 (0.058)
0.25                8             0.370 (0.078)     0.251 (0.083)
0.50               21             0.257 (0.080)     0.249 (0.032)
0.75               29             0.578 (0.098)     0.722 (0.198)
1.00               13             0.523 (0.089)     0.991 (0.127)

Source: After Coyne and Orr 1989b.

(a) Prezygotic isolation index = 1 - [(frequency heterotypic matings) / (frequency homotypic matings)].
(b) The postzygotic isolation index is a measure of hybrid inviability and hybrid sterility, scaled
from zero to one.


more typically associated with highly divergent congeneric species; see Figure 
1.2). Huge genetic distances (p > 0.12) are apparent in this taxon’s mtDNA 
genome as well (Moritz et al. 1992a). Wake et al. (1989) interpreted the results 
to evidence “several stages of speciation in what appears to be a continuous 
process of gradual allopatric, adaptive divergence,” implying that speciation in 
these salamanders must be extremely slow. On the other hand, Frost and Hillis 
(1990) argued that E. eschscholtzii should instead be considered an assemblage 
of several highly distinct species that separated in allopatry long ago. This ex- 
ample illustrates the kinds of taxonomic uncertainties (not to mention the dan- 
gers of circular reasoning) that can arise in attempts to describe “the amount of 
genetic differentiation during the speciation process.” 

At the opposite extreme, some animals and plants considered distinct taxonomic
species show very small allozyme distances (i.e., D < 0.05), well within
the range of values normally associated with conspecific populations. Early ex- 
amples were reported in herbaceous plants (Ganders 1989; Witter and Carr 
1988), insects (Harrison 1979; Simon 1979), snakes (Gartside et al. 1977), birds 
(Thorpe 1982), and mammals (Apfelbaum and Reig 1989; Hafner et al. 1987). 
Presumably, the paucity of protein electromorphs distinguishing such biologi- 
cal species indicates that insufficient time has elapsed for accumulation of 
greater de novo mutational differences. Indeed, the time-dependent aspect of 
allozyme divergence permitted reassessments of speciation dates. For example, 
two minnow species in California that had been placed in different genera 
(Hesperoleucus and Lavinia) proved to exhibit an allozyme distance of only D = 
0.05, suggesting a far more recent separation than their generic assignments 
had implied (Avise et al. 1975). In the plant genera Clarkia, Erythronium, Gaura,
and Lycopersicon, particular progenitor-derivative species pairs were reinter- 
preted to be of relatively recent origin when they were found to exhibit unex- 
pectedly low allozyme distances (Gottlieb 1974; Gottlieb and Pilz 1976; Pleas- 
ants and Wendel 1989; Rick et al. 1976). Conversely, the self-pollinating plant 
Clarkia franciscana was formerly thought to have evolved from C. rubicunda by 
rapid and recent reorganization of chromosomes, but the two species proved to 
share few or no alleles at 75% of allozyme loci, indicating that they had sepa- 
rated much longer ago than formerly supposed (Gottlieb 1973a). 

Overall, with regard to observed magnitudes of genetic divergence and in- 
ferred ecological or evolutionary times associated with speciation, data from 
the allozyme era documented a wide spectrum of outcomes. The same general 
message has emerged from post-allozyme molecular analyses (Harrison 1991), 
such as DNA hybridization (e.g., Caccone et al. 1987) and DNA sequencing 
(see Figure 1.3). Numerous examples appear throughout Chapters 6-8. Despite 
the heterogeneity in genetic patterns (which provides rich fodder for analyzing 
and comparing various speciational processes), the molecular revolution has 
made abundantly clear at least one consistent point: Even closely related taxo- 
nomic species normally differ in many genetic features, not just one or a few. 
Consider, for example, two vertebrate species estimated to differ by a net se- 
quence divergence of 1% (this would be considered a small genetic distance, of 
the approximate magnitude differentiating humans and chimpanzees, for ex- 
ample). If each of these genomes contained 3 billion bp, then a total of about 
30,000,000 nucleotide substitutions would distinguish these two species. Although
only a small fraction of these genetic changes might be directly involved
in RIB formation, they would all provide potential molecular markers
for analyzing a plethora of issues relating to speciation and hybridization. 
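
The arithmetic behind this estimate is simple enough to check directly. The sketch below uses the two figures stated in the paragraph (a genome of 3 billion bp and a net sequence divergence of 1%); the function name is a hypothetical label for illustration.

```python
def expected_substitutions(genome_size_bp, net_divergence):
    """Approximate number of nucleotide differences between two genomes."""
    return round(genome_size_bp * net_divergence)


# Values taken from the paragraph above: 3 billion bp, 1% net divergence
differences = expected_substitutions(genome_size_bp=3_000_000_000, net_divergence=0.01)
print(f"{differences:,} nucleotide substitutions")
```

Even if only one in ten thousand of these differences affected reproductive isolation, several thousand candidate sites would remain, which is the sense in which closely related species differ in many genetic features rather than one or a few.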


IDENTIFICATION AND ANALYSIS OF SPECIATION GENES. A somewhat different re- 
search tack in the post-allozyme era is to focus analyses more directly on "speci- 
ation genes" (Coyne and Orr 1998; Orr 1992), an approach with at least two dis- 
tinct aspects. The first is to employ large banks of molecular markers to identify 
the approximate number and genomic positions of quantitative trait loci, or 
QTLs (see Box 1.2), underlying morphological, behavioral, or other features that 
distinguish particular species of interest (Albertson et al. 2003; Streelman et al. 
2003; Via and Hawthorne 1998). The second is a "candidate gene" approach, 
wherein specific loci already known or suspected to play an important role in 
RIB formation are analyzed in depth, phylogenetically or functionally or both. 

Orr (2001) reviewed the literature on QTL-mapped genetic differences for 
more than 50 traits between closely related pairs of animal and plant species 
(Table 7.2). Many of these phenotypes—such as differing floral traits in related 
flower species, courtship songs in crickets (K. L. Shaw 1996, 2000), and genital 
morphologies and pheromone hydrocarbons in fruit flies (Coyne 1996; Coyne 
and Charlesworth 1997; J. Liu et al. 1996)—contributed directly to RIBs them- 
selves (Bradshaw et al. 1995, 1998). The number of genes provisionally identi- 
fied ranged from 1 to nearly 20 per trait. 
TABLE 7.2 Genetic analyses, via QTL mapping, of species' differences in numerous
phenotypic traits

Sister species                                                      Minimum
compared                 Phenotypic character                       number of genes
Drosophila fruit flies   Adult toxin resistance                     5
                         Larval toxin resistance                    3
                         Oviposition site preference                2
                         Fine larval hairs                          1
                         Various posterior lobe features            4 to 19 each
                         Sex comb tooth number                      2
                         Testis length                              7
                         Cyst length                                3
                         Tibia length                               5
                         Male pheromone                             5
                         Female pheromone                           5
                         Various counts of bristle number           1 to 6 each
                         Fifth sternite                             1
                         Anal plate area                            3
                         Cuticular hydrocarbon profile              1
                         Male courtship song                        2
Nasonia wasps            Wing size                                  2
Laupala crickets         Song pulse rate                            8
Mimulus monkeyflowers    Concentrations of various pigments         1 to 3 each
                         Lateral petal width                        8
                         Various corolla features                   4 to 10 each
                         Petal reflexing                            4
                         Nectar volume                              3
                         Stamen lengths in various
                           species comparisons                      3 to 7 each
                         Pistil lengths in various
                           species comparisons                      1 to 13 each
                         Bud growth rate                            8
                         Anther-stigma separation
                           in various comparisons                   2 to 5 each

Source: Modified from a review by Orr (2001).


However, several cautionary points should be made about such genetic 
appraisals. First, the statistical power to detect significant associations varied 
considerably across studies, so not all results are directly comparable. Second, 
such tallies alone do not accurately describe the distribution of the magnitude 
of phenotypic effects across loci, a particular weakness being in the identification
of genes with small or modest effect. Thus, the number of polygenes contributing
to a trait is normally underestimated by this QTL approach, and a reporting
bias exists toward the notion that most identified genes have substantial
effects on phenotype. Third, critical attention in QTL mapping should be
devoted to evaluating epistasis and dominance—that is, non-additive allelic in- 
teractions on phenotype among and within loci, respectively (Turelli and Orr 
2000)—but unfortunately these phenomena are often neglected (but see Kim 
and Rieseberg 2001, for a nice exception). Finally, such QTL studies can be ap- 
plied only to hybridizable species pairs, and normally to phenotypic traits that 
differ cleanly between them. The net effect of these and other qualifications 
about QTL mapping, plus the wide variety of observed outcomes reported to 
date, led Orr (2001) to a cautious conclusion: "Such results do not encourage 
the idea that the genetics of species differences shows any regularities." 

Nevertheless, considering the total number of phenotypic differences often 
distinguishing congeners and the minimum tallies of responsible genes per trait 
in several taxa studied to date by QTL mapping, the composite number of ge- 
netic changes between even closely related species must normally be large. For 
example, Kim and Rieseberg (1999) identified 56 QTL loci contributing to differ- 
ences in 15 morphological traits between closely allied species of Helianthus sun- 
flowers. Furthermore, the number of genes deduced to contribute directly to 
prezygotic and postzygotic RIBs seems to be fairly substantial in available studies
(see Table 7.2; Civetta et al. 2002; Hollocher and Wu 1996; Sawamura et al.
2000). On the other hand, in many animal and plant species, the number of ge- 
netic changes distinguishing long-separated allopatric populations may also be 
large, so an important future task will be to conduct similar kinds of QTL map- 
ping experiments on conspecific populations and compare the results to the in- 
terspecific outcomes. Another way to look at this challenge is to appreciate that 
results from available QTL mapping studies provide tallies of accumulated dif- 
ferences between species, but that some or all of these genetic changes might 
have either predated or postdated completion of the speciation process itself 
(the same cautionary note applies to nearly all molecular analyses of genetic dif- 
ferences between extant species). Furthermore, although QTL analyses identify 
chromosomal regions contributing to phenotypic differences, they do not by 
themselves identify the actual genes that are mechanistically responsible. 

The second approach, known as the "candidate gene" method (Haag and 
True 2001), focuses even more directly on particular loci suspected to play an 
immediate role in providing RIBs between sister species. These analyses can be 
phylogenetic, functional, or both. A phylogenetic appraisal is illustrated by 
studies of Odysseus (OdsH), a gene known to be responsible for hybrid male 
sterility in fruit flies (Perez et al. 1993). Phylogenetic analyses of sequence poly- 
morphisms in Drosophila mauritiana and D. simulans showed that these closely 
related species are reciprocally monophyletic in the OdsH gene tree, but not in- 
variably so at other loci not directly involved in RIB formation (Ting et al. 2000). 
The authors concluded that RIB-causing genes such as OdsH faithfully track the 
evolutionary history of reproductive isolation (i.e., speciation per se), whereas 
non-RIB loci are expected to display a much wider variety of phylogenetic pat- 
terns due to evolutionary factors such as retention of ancestral polymorphisms 
or post-speciation movement of genes via hybridization (see also Wu 2001).

Another example involved a detailed molecular dissection of a gene
(Nup96) in Drosophila that encodes a nuclear pore protein (Presgraves et al. 
2003). Found on chromosome 3, Nup96 interacts with one or more unknown 
genes on the X chromosome. Within either D. simulans or D. melanogaster, the 
protein products of these genes apparently interact well together, but in hy- 
brids between them, the genes interact epistatically to reduce viability severe- 
ly. The authors demonstrate that this nuclear pore protein evolved by positive 
natural selection in both species' lineages such that the hybrid inviability is 
merely a by-product of adaptive intraspecific protein evolution. 
Another nice empirical example of this sort involved experimental analy- 
ses of functional coadaptation between two cellular proteins (cytochrome c 
and cytochrome c oxidase) in a marine copepod, Tigriopus californicus. These 
two proteins interact during a final step of the mitochondrial electron trans- 
port system (ETS), which plays a central role in cellular energy production. In 
laboratory assays, Rawson and Burton (2002) discovered that cytochrome c 
variants isolated from each of two genetically divergent copepod populations 
had significantly higher reactivity with cytochrome c oxidase molecules de- 
rived from their own population than with those from the alien population. 
These results indicate the presence of positive epistatic interactions between 
coadapted ETS proteins. It is also known that inter-population crosses in T. 
californicus yield later-generation hybrids with reduced performance in a wide 
variety of fitness-related traits (e.g., Edmands 1999). Taken together, these 
studies suggest that any hybridization between genetically divergent cope- 
pods would disrupt the coadapted ETS complex, thereby contributing to func- 
tional incompatibilities in cellular respiration that underlie partial reproduc- 
tive isolation. 

Functional as well as population genetic analyses of speciation genes are especially
well illustrated by studies of "gamete recognition" loci (Palumbi 1998;
Snell 1990), a subcategory within the broader set of genes responsible for prezygotic
RIBs between many closely related species (Howard et al. 1998). Most marine
invertebrates release their gametes into the water, so sperm and eggs of
each species must find and recognize each other for successful fertilization. The
cellular mechanisms involved have proved to be diverse. In echinoderms, for
example, gametic attachment and fusion are mediated by sperm bindin proteins
that interact with carbohydrates attached to proteins on the egg surface
(Palumbi 1999; Vacquier et al. 1995), whereas in mollusks, a lysin protein mediates
how well a sperm can burrow through an egg's chorion layer (Vacquier and
Lee 1993). In addition to functional studies of these genes' modes of action, population
genetic analyses of DNA sequences have shown that characteristic regions
in the bindin and lysin molecules evolve rapidly both within and among
related species (Lee et al. 1995; Metz and Palumbi 1996; Swanson et al. 2001a). 
Observed rates and patterns of amino acid substitution also indicate that these
gamete recognition loci are often under positive diversifying selection. One hy- 
pothesis is that sperm in general should be under strong selection for rapid egg 
entry because, in the open ocean, any sperm cell is likely to encounter at most 
only one egg and must take advantage of the opportunity; whereas eggs are under
strong selection for defense against untoward sperm advances because a
typical egg may encounter swarms of sperm cells, only one of which is geneti- 
cally required (Palumbi 1998). One net result is that eggs and sperm may be en- 
gaged in a coevolutionary "arms race" of offensive and defensive tactics that 
promote rapid molecular evolution in gamete recognition genes (Rice 1998). An- 
other evolutionary consequence is an opportunity for rapid reproductive char- 
acter displacement at gametic recognition loci as a barrier to detrimental hy- 
bridization between species (Geyer and Palumbi 2003). 

In species with internal fertilization, it has been found that proteins inti- 
mately associated with reproduction also often evolve extremely rapidly, prob- 
ably for similar kinds of selective reasons. Examples of genes showing rapid se- 
quence evolution in replacement sites include the zona pellucida genes whose 
glycoprotein products coat the eggs of mammals (Swanson et al. 2001b) and 
genes for accessory gland proteins that occur in the seminal fluid of Drosophila 
males (Begun et al. 2000; Swanson et al. 2001c). 


Do founder-induced speciations leave definitive genetic signatures? 


All the sudden modes of speciation described in Box 7.3 no doubt are initiated 
by very small numbers of individuals who first acquire the relevant chromosomal
or reproductive alterations. Apart from such situations, do founder events
underlie speciations in many other animal and plant groups? If so, such speci- 
ations might entail significant shifts in frequencies of ancestral polymorphisms, 
but at the outset probably little de novo sequence evolution, and the ancestral 
species would normally be paraphyletic with respect to the derivative for at 
least some evolutionary time following their separation. A severe and pro- 
longed population bottleneck accompanying speciation should also greatly di- 
minish genetic variability in the neospecies. 

A remarkable radiation of drosophilid flies has occurred in the Hawaiian 
archipelago, which is home to about 800 species endemic to the islands, com- 
pared with about 2,000 species in the remainder of the world (Carson and 
Kaneshiro 1976; Wheeler 1986). Founder-induced speciation models figured 
prominently in early discussions of the prolific speciation among Hawaiian 
Drosophila (Giddings et al. 1989), in which species formation was postulated to 
follow the colonization of new islands, perhaps by one or a small number of 
gravid females. However, molecular genetic data seemed to be equivocal about 
these scenarios. Some sister species, such as D. silvestris and D. heteroneura, did 
indeed prove to exhibit high allozyme similarities suggestive of recent specia- 
tion (Sene and Carson 1977). On the other hand, many recently derived Hawai- 
ian species proved to be no less variable genetically than typical continental 
Drosophila, a result used by Barton and Charlesworth (1984) to dispute the 
founder model, but defended by Carson and Templeton (1984) as consistent 
with the founder-flush and transilience models of speciation. D. silvestris and 
D. heteroneura also showed relatively high genotypic and nucleotide diversities 
in mtDNA (DeSalle et al. 1986a,b), further suggesting to Barton (1989) that 
founder-induced speciation was not involved.

In the Australian fish Galaxias truttaceus, a landlocked form of which constitutes
an incipient species separated from coastal ancestors within the last
3,000-7,000 years (based on geological evidence), surveys of both allozymes and 
mtDNA variation have been conducted (Ovenden and White 1990). Of the total 
of 58 mtDNA haplotypes observed in coastal populations, only two character- 
ized the landlocked forms. Heterozygosities at allozyme loci nonetheless were 
nearly identical in the landlocked and coastal populations. These genetic results 
were interpreted to indicate that a severe but transitory population bottleneck ac- 
companied the transition to lacustrine habitat (because in principle, such bottle- 
necks might affect the genotypic diversity of mtDNA more than that of nDNA). 
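
The parenthetical argument above can be made concrete with a minimal sketch. It assumes the standard drift expectation that gene diversity decays by a factor of 1 - 1/(number of gene copies) each generation, a bottleneck of constant size with an equal sex ratio, and strictly maternal, haploid transmission of mtDNA. The function names and the example numbers are illustrative assumptions, not values from Ovenden and White (1990).

```python
def retained_nuclear(n_breeders, generations):
    """Fraction of autosomal heterozygosity expected to survive the bottleneck.

    A diploid population of n_breeders carries 2 * n_breeders gene copies.
    """
    return (1.0 - 1.0 / (2.0 * n_breeders)) ** generations


def retained_mtdna(n_breeders, generations):
    """Fraction of mtDNA haplotype diversity expected to survive the bottleneck.

    Assumes only the n_breeders / 2 females each transmit one haploid genome.
    """
    n_females = n_breeders / 2.0
    return (1.0 - 1.0 / n_females) ** generations


if __name__ == "__main__":
    # (breeders, generations): short severe, short mild, and prolonged severe cases
    for n, t in [(10, 5), (100, 5), (10, 50)]:
        print(n, t, round(retained_nuclear(n, t), 3), round(retained_mtdna(n, t), 3))
```

Under these assumptions, a brief but severe bottleneck removes mtDNA diversity much faster than nuclear heterozygosity, whereas a prolonged one erodes both.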

In terms of gene genealogy, a founder-induced speciation should initially 
produce a paraphyletic relationship between the ancestral and descendant 
species (see Figure 7.1C-E). Many examples of paraphyly in mtDNA or scn- 
DNA gene trees have been reported for related species (Avise 2000a; Powell 
1991). Indeed, a recent literature survey of more than 2,200 species identified 
more than 500 cases (ca. 23%) in which a paraphyletic relationship between re- 
lated taxonomic species had been statistically documented (by bootstrap crite- 
ria) in mtDNA gene trees (Table 7.3). Four empirical examples are illustrated in 
Figure 7.3. For example, the deer mouse (Peromyscus maniculatus) occupies 
most of North America and exhibits a paraphyletic relationship to the old-field 
mouse (P. polionotus), a species confined to the southeastern United States. Sim- 
ilarly, the mallard duck (Anas platyrhynchos), with broad Holarctic distribution, 
appears paraphyletic in mtDNA genealogy to the American black duck (Anas 
rubripes), which inhabits eastern North America only (Avise et al. 1990a). As detailed
in Chapter 6, perhaps the most unexpected and remarkable of such examples
involves the brown bear (Ursus arctos), which appears paraphyletic in
matrilineal pattern to the polar bear (Ursus maritimus) despite these species' 
grossly different phenotypes (see Figure 6.9). 
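
How a gene tree might be scored as reciprocally monophyletic, paraphyletic, or polyphyletic for a pair of species can be sketched as follows. This is a minimal illustration, assuming a rooted tree written as nested tuples and hypothetical haplotype labels. It is not the procedure used in the surveys cited above, which also weigh statistical support such as bootstrap values.

```python
def collect_clades(tree, out):
    """Append the leaf set below every node of a nested-tuple tree to `out`."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
    else:
        leaves = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(leaves)
    return leaves


def classify_pair(tree, species_of):
    """Classify the relationship of two species in a rooted gene tree.

    `species_of` maps each haplotype label to one of exactly two species names.
    """
    clades = []
    collect_clades(tree, clades)
    clade_set = set(clades)
    names = sorted(set(species_of.values()))
    if len(names) != 2:
        raise ValueError("exactly two species are expected")
    # A species is monophyletic if its haplotypes make up one whole clade.
    monophyletic = {
        name: frozenset(h for h, s in species_of.items() if s == name) in clade_set
        for name in names
    }
    a, b = names
    if monophyletic[a] and monophyletic[b]:
        return "reciprocal monophyly"
    if monophyletic[a]:
        return f"{b} paraphyletic with respect to {a}"
    if monophyletic[b]:
        return f"{a} paraphyletic with respect to {b}"
    return "polyphyly"


# Hypothetical haplotypes: "m*" from a widespread species, "p*" from a derivative.
gene_tree = ((("m1", "m2"), ("m3", ("p1", "p2"))), "m4")
species_of = {
    "m1": "widespread", "m2": "widespread", "m3": "widespread", "m4": "widespread",
    "p1": "derivative", "p2": "derivative",
}
print(classify_pair(gene_tree, species_of))
```

In this toy tree the derivative's haplotypes form a single clade nested among those of the widespread species, the pattern described for the mouse, duck, and bear examples.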

However, for the species depicted in Figure 7.3 and others like them, the 
mere appearance of genealogical paraphyly in a gene tree, or even in a com- 
posite organismal phylogeny, is insufficient for concluding that founder-in- 
duced speciations necessarily were involved, for several biological reasons (i.e., 
apart from “bad taxonomy,” mistaken gene trees, or other artifactual causes). 
First, paraphyly is expected even in the absence of severe population bottle- 
necks when a derivative, geographically restricted species emerges via gradual 
allopatric divergence (Figure 7.1C). Second, under most geographic modes of 
speciation entailing even moderate or large populations, paraphyly in gene 
trees is a fully anticipated stage preceding reciprocal monophyly and often fol- 
lowing genealogical polyphyly (see Chapter 4). Finally, the appearance of para- 
phyly also can result from secondary introgressive hybridization that has 
transferred some allelic lineages from one species to another (see examples in 
Freeland and Boag 1999; Shaw 2002; Sota and Vogler 2001; and the section on 
hybridization below). The latter possibility has been invoked, for example, to 
account in part for the paraphyly or polyphyly of the mallard duck to the black 
duck and other related species with which it often hybridizes extensively (McCracken
et al. 2001; Rhymer et al. 1994).

Figure 7.3 Empirical examples of genetic paraphyly for closely related species.

(A, B) mtDNA gene trees for Peromyscus mice and Anas ducks (see text for references). 
(C) mtDNA gene tree for Bufo toads (Slade and Moritz 1998). (D) cpDNA and nuclear 
rDNA gene trees for Helianthus sunflowers (Rieseberg and Brouillet 1994). (After 
Avise 2000a.) 


The influences of founder events on patterns of genetic differentiation 
among species and on magnitudes of genic variability within species are in the- 
ory also functions of the size and duration of each population bottleneck and 
the subsequent rate of population growth when variation recovers (Nei et al. 
1975). All else being equal, uniparentally inherited cytoplasmic genes might 
register founder effects more clearly than autosomal nuclear loci because of 
their expected fourfold lower effective population size (Palumbi et al. 2001). 
However, this is merely a baseline expectation from which departures can arise 
for a variety of biological reasons (Hoelzer 1997; Moore 1995, 1997), even apart 
from the high inherent stochasticity with regard to which lineages happen to 
survive in a bottlenecked (or other) population to contribute to genetic diversi- 
ty in descendants (Edwards and Beerli 2000). Furthermore, firm genetic infer- 
ences about historical demographic events accompanying speciations can also 
be confounded by non-speciational population bottlenecks that may have predated
and/or postdated erection of RIBs themselves.

TABLE 7.3 Instances of statistically documented paraphyly (including "polyphyly")
uncovered in a literature survey of mtDNA gene trees for congeneric
animal species

Taxonomic              Number        Number       Number        % of species
group                  of studies    of genera    of species    paraphyletic
Mammals                139           102          469           17.0
Birds                   74            87          331           16.7
Reptiles                56            45          147           22.4
Amphibians              35            26          137           21.3
Fishes                 100            99          371           24.3
Arthropods             143           126          702           26.5
Other invertebrates     37            41          162           38.6
Total                  584           526         2319           23.1

Source: After Funk and Omland 2003.


Theoretical objections to the founder-induced speciation model have em- 
phasized the low likelihood that small populations can successfully traverse 
major adaptive peaks (Barton 1989, 1996; Barton and Charlesworth 1984) or, in 
general, that they are more predisposed to speciation than large populations 
(Orr and Orr 1996). For example, one recent theory posits that reproductive iso- 
lation is often driven by conflicts of interest between the sexes and so might 
evolve most rapidly in large, dense populations, in which this type of selection 
should be most effective (Gavrilets 2000; Martin and Hosken 2003). However, 
not everyone accepts such theoretical objections to speciation in small popula- 
tions (Hollocher 1996), and the original observation that motivated founder-in- 
duced speciation scenarios—namely, that insular populations or those at the pe- 
riphery of a species range often show unusually high phenotypic divergence 
(e.g., Berry 1996)—still holds. In theory, rapid founder-induced speciations 
should be reproducible in appropriate experimental settings, especially in or- 
ganisms with short generation times, but such "population cage" tests (mostly 
in dipteran flies) have yielded equivocal results at best (Moya et al. 1995). In 
summarizing this overall state of affairs, Howard (1998) concluded, “The ques- 
tion of whether small founder populations play an important role in genetic di- 
vergence and speciation is still open, although there is probably less enthusiasm 
for the role of founder events in speciation now than existed a decade ago." 


What other kinds of phylogenetic signatures do past speciations provide?

Several other approaches to translating molecular observations on extant 
species into plausible inferences about the nature and tempo of speciations past 
have been suggested (Harvey et al. 1994; Kirkpatrick and Slatkin 1993; Klicka 
and Zink 1999; Losos and Adler 1995; Nee et al. 1994a; Rogers 1994). Typically, 
these approaches employ phylogenetic methods to assess the shapes of evolutionary
trees and thereby address whether cladogenesis across time departs
significantly from specified null models of lineage diversification. 

“Lineage-through-time” analyses (Barraclough and Nee 2001; Nee 2001) can 
serve to illustrate the general conceptual orientation of several such approaches 
(Figure 7.4). The lineage-through-time model views speciations and extinctions in 
a supraspecific phylogeny as analogous to births and deaths of individuals in a
population or gains and losses of lineages in an intraspecific gene tree. It assumes 
that a phylogram is available (e.g., from molecular data) for the extant species un- 
der analysis. It then asks whether the changing number of total lineages in the 
tree, when graphed as a lineages-through-time plot (log scale), might indicate a 
constant uniform rate of speciation in all branches throughout the tree, in which 
case the plot could show a straight line with slope equal to the per lineage specia- 
tion rate; recent accelerated speciation in the tree, in which case the plot could be 
concave upward; or recent decelerated speciation, in which case the plot could be 
concave downward. These expectations are somewhat equivocal, however 
(Kubo and Iwasa 1995). For example, a concave downward curve might also be
interpreted to register a recent increase in the extinction rate (Nee et al. 1994b). 
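
The bookkeeping behind a lineages-through-time plot can be sketched in a few lines. The example assumes a dated, ultrametric tree summarized simply as a list of internal node ages (a hypothetical input format). The midpoint-versus-chord measure is a crude illustrative device for curvature, not a published test statistic, and reading the fitted slope as a per-lineage speciation rate presumes no extinction.

```python
import math


def lineages_through_time(node_ages):
    """Return [(time_since_root, lineage_count), ...] for a bifurcating tree.

    `node_ages` lists the age (time before present) of every internal node,
    root included. Each node adds one lineage.
    """
    ages = sorted(node_ages, reverse=True)  # oldest node (the root) first
    root_age = ages[0]
    points = []
    lineages = 1
    for age in ages:
        lineages += 1
        elapsed = root_age - age
        if points and points[-1][0] == elapsed:
            points[-1] = (elapsed, lineages)  # simultaneous splits share one step
        else:
            points.append((elapsed, lineages))
    return points


def log_slope(points):
    """Least-squares slope of ln(lineages) against time."""
    xs = [t for t, _ in points]
    ys = [math.log(n) for _, n in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def midpoint_deviation(points):
    """ln(lineages) at the middle step minus the chord from first to last step.

    Near zero: roughly log-linear. Negative: concave upward.
    Positive: concave downward.
    """
    (x0, n0), (x1, n1) = points[0], points[-1]
    xm, nm = points[len(points) // 2]
    chord = math.log(n0) + (math.log(n1) - math.log(n0)) * (xm - x0) / (x1 - x0)
    return math.log(nm) - chord


steady = lineages_through_time([3, 2, 2, 1, 1, 1, 1])  # lineages double each time unit
slowing = lineages_through_time([4, 3, 3, 1])           # early burst, then few splits
print(steady, log_slope(steady), midpoint_deviation(steady))
print(slowing, midpoint_deviation(slowing))
```

As the surrounding text cautions, a curved plot of this kind is compatible with more than one history, so such summaries are starting points for model comparison rather than conclusions.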

Notwithstanding such caveats, this and alternative statistical phylogenetic 
approaches (e.g., Wollenberg et al. 1996) have been employed, provisionally, to 
infer nonrandom patterns of cladogenesis in large evolutionary clades ranging 
from Cicindela tiger beetles (Barraclough et al. 1999) to Dendroica warblers 
(Lovette and Bermingham 1999). For example, statistical analyses of the shapes 
of molecule-based phylogenies provided evidence for a significant temporal 
clustering of ancient cladogenetic events in a "species flock" of Sebastes rock- 
fishes in the North Pacific (Johns and Avise 1998b), which provided an interest- 
ing comparison to recent explosive cladogenesis in a species flock of African 
freshwater cichlid fishes (to be described later in this chapter). 


Are speciation rates and divergence rates correlated? 


One intriguing possibility is that speciation events themselves might accelerate 
evolutionary differentiation within clades. If so, then magnitudes of divergence 
between extant species could be proportional to numbers of speciation events in 
their evolutionary histories, rather than to elapsed times since common ancestry. 
With regard to morphological divergence, this is indeed a logical consequence of 
the original model of punctuated equilibrium (Eldredge and Gould 1972), which 
posited stasis for organismal lineages except during speciation events. To test this 
possibility at the genetic level, Avise and Ayala (1975) introduced a conceptual 
approach that involves comparing pairwise genetic distances within clades that 
have experienced different rates of speciation. If genetic divergence is propor- 
tional to time, then mean genetic distances among extant species should be simi- 
lar in rapidly speciating (species-rich) and slowly speciating (species-poor) 
clades of similar evolutionary age, whereas if genetic divergence is a function of 
the number of speciation events, mean genetic distance among extant forms 
should be obviously greater in the species-rich clade (Figure 7.5).

Figure 7.4 Simplified illustration of the use of "lineages-through-time" plots (center)
for inferring past cladogenetic patterns from the shapes of phylogenetic trees (top
and bottom). (In actual cases, the clades to be examined would be much larger, and 
each lineage-through-time plot would be on a logarithmic scale; Barraclough and Nee 
2001.) The method involves tallying and graphing the number of species lineages at 
successive temporal points to assess possible departures from constant rates of specia- 
tion or extinction through time (see text). 


[Figure 7.5 diagram: phylogenies for clades R and P drawn against a time axis (4 to 1),
with a distance matrix for phylad P (species A-D) and the following expected values.]
Phyletic gradualism:    D_P = 20/6 = 3.33;  D_R = 392/120 = 3.27;  D_R/D_P = 0.98
Punctuated equilibrium: D_P = 14/6 = 2.33;  D_R = 664/120 = 5.53;  D_R/D_P = 2.37

Figure 7.5 Explanation of models underlying a test for whether rates of genetic di- 
vergence and rates of speciation are correlated. “R” and "P" are species-rich and 
species-poor clades of comparable evolutionary age. The distance matrix in the upper 
right applies to clade P (the larger matrix for R is not presented) and shows expected 
distances between species pairs when differentiation is either time-dependent
("phyletic gradualism," above diagonal) or speciation-dependent ("punctuated equilibrium,"
below diagonal). At the lower right are shown expected ratios of mean distances
(D values) for extant species within R and P under these competing models. Under
phyletic gradualism, D_R/D_P ≈ 1.0, whereas under punctuated equilibrium, D_R/D_P
>> 1.0. (After Avise and Ayala 1975.)


One set of empirical tests utilizing this approach involved the speciose 
North American minnows (approximately 200 species, more than 100 then 
recognized within Notropis alone) and the relatively depauperate sunfishes 
(approximately 30 species, 11 within Lepomis). Based on fossil evidence, these 
groups both appeared to have Miocene origins on the continent. A key as- 
sumption underlying the test was that the minnows had speciated more rap- 
idly than the sunfishes (rather than having experienced lower rates of extinc- 
tion). Allozyme comparisons involving more than 80 species (Avise 1977a; 
Avise and Ayala 1976) revealed nearly identical mean genetic distances 
among species within the two groups (e.g., D = 0.62 and D = 0.63 in Notropis 
and Lepomis, respectively), as well as similar mean heterozygosities per
species (H = 0.052 and H = 0.049; Avise 1977b). These results were provisionally
interpreted as inconsistent with punctuated equilibrium with regard to
allozyme evolution in these fishes. 
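
The expectations that this test compares (Figure 7.5) can be reproduced with a short sketch. It assumes fully balanced trees (4 species in clade P, 16 in clade R, both 4 time units old) and, for the speciation-dependent model, one unit of distance per node on the path joining two species. These assumptions were chosen because they recover the totals printed in the figure (20/6, 392/120, 14/6, and 664/120); the function name and tree encoding are illustrative only.

```python
from itertools import combinations


def mean_distances(node_times):
    """Mean pairwise distances among the tips of a fully balanced tree.

    `node_times[k - 1]` is the age of the nodes k levels above the tips, so the
    tree has 2 ** len(node_times) extant species. Returns a pair:
    (time-dependent mean, speciation-dependent mean).
    """
    n_tips = 2 ** len(node_times)
    gradual_total = 0
    punctuated_total = 0
    n_pairs = 0
    for i, j in combinations(range(n_tips), 2):
        level = (i ^ j).bit_length()            # level of the most recent common ancestor
        gradual_total += node_times[level - 1]  # distance tracks elapsed time
        punctuated_total += 2 * level - 1       # distance tracks nodes on the path
        n_pairs += 1
    return gradual_total / n_pairs, punctuated_total / n_pairs


poor = mean_distances([2, 4])        # clade P: nodes 2 and 4 time units old
rich = mean_distances([1, 2, 3, 4])  # clade R: one level of nodes per time unit
print("gradualism  D_R/D_P =", round(rich[0] / poor[0], 2))
print("punctuated  D_R/D_P =", round(rich[1] / poor[1], 2))
```

For comparison, the observed allozyme distances quoted above (0.62 in Notropis and 0.63 in Lepomis) give a ratio close to 1, which is the pattern expected when divergence tracks time rather than the number of speciation events.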

Gould and Eldredge (1977) justifiably questioned the relevance of these 
findings to the broader debate between punctuated equilibrium and phyletic 
gradualism, pointing out that the controversy referred primarily to patterns of 
morphological divergence, whereas allozymes were neutral molecular charac- 
ters and, hence, irrelevant to the discussion. Douglas and Avise (1982) respond- 
ed by reexamining these fishes for quantitative morphological differences and 
showed by multivariate statistical analyses that the overall magnitudes of phe- 
notypic divergence were closely similar in the minnows and the sunfishes, a re- 
sult again inconsistent with some of the predictions of punctuated equilibrium. 

Nonetheless, these particular tests had several acknowledged weaknesses. 
For example, minnow and sunfish taxa originally were recognized on the basis 
of qualitative morphological appraisals, so the quantitative reassessment by 
Douglas and Avise (1982) probably included elements of “circular” reasoning 
in this supposed test of “rectangular” evolution! Another concern, voiced by 
Mayden (1986), was that the minnows examined might not be monophyletic. 
However, if the North American minnows do not constitute a clade, then the 
actual intervening number of speciation events would be even greater than as- 
sumed, in which case the minnows should have displayed even larger mean 
genetic distances under the rectangular evolution model. So non-monophyly 
should have biased the outcome toward (rather than against) the predictions of 
punctuated equilibrium. In any event, it is true that such tests ideally should 
involve sister clades so that, by definition, the evolutionary times over which 
the differential species proliferations took place would be identical. However, it 
is difficult to find sister clades that are differentially branchy or “unbalanced” 
(Heard 1996) on opposite sides of a basal node. 

Despite the importance of this topic, relatively few explicit tests of this sort 
involving molecular data from extant species have been attempted (but see 
Barraclough and Savolainen 2001; Lemen and Freeman 1981, 1989; McCune 
and Lovejoy 1998; Ricklefs 1980). Using published data from DNA hybridiza- 
tion and mtDNA sequences to contrast numerous independent pairs of tropical 
and temperate avian taxa, Bromham and Cardillo (2003) tested and provisionally
rejected a hypothesized link (Rohde 1992) between rates of speciation and
rates of molecular evolution along global latitudinal gradients. Mindell et al. 
(1990) reported a different outcome when they summarized available literature 
on allozyme genetic distances within 111 vertebrate genera and noted a posi- 
tive correlation between genetic divergence and species richness, which they 
attributed to the accelerating influence of speciation on molecular differentia- 
tion. However, among taxa this diverse, one cannot be certain that other un- 
controlled variables do not also correlate with, and possibly influence, molecu- 
lar rates. Similarly, in a recent review of 56 published molecular phylogenies, 
Webster et al. (2003) identified a significant association between number of speciations
and rate of genetic evolution in 30%-50% of the evolutionary trees. Results
were interpreted to suggest that molecular clocks are not gradual, but
rather are punctuated according to speciation events. On the other hand, at
least half of those analyzed phylogenetic trees showed no significant correla- 
tion between rate of molecular evolution and apparent rate of speciation, so 
any association between the two must be either non-universal or rather subtle. 


Can speciation occur sympatrically? 


A long-standing issue in evolutionary biology concerns how often new species 
arise sympatrically, most likely under the influence of diversifying selection on
resource utilization or mate choice (Levene 1953; Maynard Smith 1966; Schluter
1996a, 2000; Tauber and Tauber 1989). Some biologists question whether sym- 
patric speciation plays any significant role in evolution (e.g., Coyne 1992; 
Felsenstein 1981b; Futuyma and Mayer 1980; Mayr 1963), whereas others per- 
ceive it as a prevalent mode in at least some taxonomic groups, such as insects 
(e.g., Berlocher and Feder 2002; Bush 1975; Dieckmann and Doebeli 1999; Hi- 
gashi et al. 1999; Kondrashov and Mina 1986). Here, several examples will be 
provided in which molecular markers have shed some light on the possibility 
of sympatric speciation. 


HOST SHIFTS OR HABITAT SWITCHING IN INSECTS. Changes in host usage by 
phytophagous or zoophagous parasites might give quick rise to new host races 
or neospecies that are reproductively isolated from their sympatric progenitors 
(Bush 1975; Wood et al. 1999). Host-specific parasites that faithfully complete 
their entire life cycle on a particular host species are especially strong candi- 
dates for sympatric speciation (or at least for local syntopic speciation). In such 
cases, any rare shift in host utilization almost by definition causes a simultane- 
ous shift in mating options, such that ecological differentiation and prezygotic 
isolation go hand in hand (Hawthorne and Via 2001; Kondrashov and Kon- 
drashov 1999). For any such speciation event that took place recently and in- 
volved a small number of founders initiating the host shift, population genetic 
signatures should include a reduction of variation in the neospecies and a pa- 
raphyletic relationship of the progenitor species to the derivative. About a 
dozen additional phylogeographic signatures of various types of sympatric 
speciation events have been suggested (Via 2001), but only seldom do these 
alone permit firm elimination of all competing hypotheses (Berlocher 1998). 
The classic example of host switching involves frugivorous Rhagoletis flies, in 
which several host-specific forms, such as an apple race and a hawthorn race of 
R. pomonella, are postulated to be undergoing sympatric speciation in modern
times (Bush 1969; Linn et al. 2003). Available genetic data are not inconsistent 
with this possibility. Collections of flies from apple and hawthorn trees in local 
sympatry show small but significant allozyme differentiation (Feder et al. 1988, 
1997; McPheron et al. 1988), and such differences between paired samples extend 
across the eastern United States and Canada (Feder et al. 1990a,b). Furthermore, 
Berlocher and Bush (1982; see also Berlocher 2000; Smith and Bush 1997) con- 
cluded from molecular data that the phylogeny for many Rhagoletis flies and 
their relatives differs from that of their hosts, a result consistent with host-switching
aspects of sympatric speciation scenarios (see below). A recent phylogeographic
analysis based on nuclear and mitochondrial loci added a twist to sympatric
speciation scenarios for Rhagoletis: An ancestral, hawthorn-infesting population
apparently was subdivided about 1.5 million years ago into Mexican and
North American isolates that diverged in adaptive diapause traits; subsequent 
gene flow from Mexico may then have aided North American flies in adapting to 
a variety of plants with different fruiting times, thereby helping to spawn new 
species, perhaps sympatrically (Feder et al. 2003). 

Molecular markers have likewise indicated genetic differences between 
several other insect "races" suspected of undergoing host-shift speciations, such 
as gallmaker flies (Eurosta; Brown et al. 1996; Waring et al. 1990), pea aphids 
(Acyrthosiphon; Via 1999), yucca moths (Prodoxus; Groman and Pellmyr 2000), 
and seed-parasitic moths (Greya; Brown et al. 1997). About 25%-40% of all ani-
mal species are thought to be phytophagous specialists (Berlocher and Feder
2002), so even if sympatric speciation is confined to a modest fraction of such or- 
ganisms, it could be highly important to speciation theory and biodiversity. 


FLOCKS OF FISHES. Within each of several isolated lakes or drainages scattered 
around the world, particular groups of related fishes constitute “species flocks” 
(Echelle and Kornfield 1984) that might have diversified through sympatric 
speciation. The numbers of named species in such fish flocks range from just a 
few, as in salmonid complexes in high-latitude lakes of the Northern Hemi- 
sphere, to many hundreds, as in cichlids in some of the Rift Valley lakes of east- 
ern Africa (Fryer and Iles 1972; Greenwood 1981). As gauged by molecular ge- 
netic evidence, some of these flocks are evolutionarily old (e.g., Cottus sculpins 
in Russia's Lake Baikal; Grachev et al. 1992), whereas others are extremely 
young (e.g., cichlids in Africa's Lake Victoria; Stiassny and Meyer 1999). 

Especially for the youngest flocks, one preliminary question is whether dif- 
ferences among sympatric morphotypes indeed reflect the presence of distinct 
species (i.e., reproductively isolated gene pools), rather than intraspecific pheno- 
typic polymorphisms, perhaps due to ontogenetic switches triggered by ecologi- 
cal conditions. Molecular markers are well suited to address this question. For 
example, allozyme markers were used to examine Cichlasoma fishes in an isolat-
ed basin near Coahuila, Mexico, where three trophic morphs described as sepa-
rate species occur: a snail-eating form with molariform teeth and a short gut, a
detritus-feeding or algae-eating form with papilliform teeth and a long gut, and 
a fish-eating form with a fusiform body. At all 27 monomorphic and polymor- 
phic loci assayed, these morphs proved to be indistinguishable (Kornfield and 
Koehn 1975; Kornfield et al. 1982), a result interpreted by Sage and Selander 
(1975) to indicate that trophic radiation in these cichlids was achieved by ecolog- 
ical polymorphism, not speciation. Aquarium experiments confirmed this con- 
clusion when it was shown that distinct morphotypes could be generated among 
progeny within a brood by altering the rearing conditions (Meyer 1987). 

A similar report of dramatic trophic polymorphism due to phenotypic 
plasticity involved Ilyodon fishes. In some Mexican streams, sharply dichoto- 
mous trophic morphs, formerly considered distinct species, were shown to be 
indistinguishable at several polymorphic allozyme loci, and pooled genotypic
frequencies conformed to Hardy-Weinberg expectations for randomly mating 
populations (Grudzien and Turner 1984). These data strongly suggested that 
the trophic types are conspecific (Turner and Grosse 1980). 
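
The Hardy-Weinberg comparison described above can be sketched in a few lines of
Python. This is a minimal illustration, not the analysis of Grudzien and Turner
(1984): the function name, the single two-allele locus, and all genotype counts
are hypothetical assumptions. The logic is that pooling two isolated gene pools
with different allele frequencies tends to produce a deficit of heterozygotes,
whereas a single randomly mating population should show no such departure.

```python
def hardy_weinberg_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts at one
    two-allele locus with Hardy-Weinberg expectations.

    Assumes the locus is polymorphic (both alleles present), so that no
    expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical pooled sample of two morphs that share one gene pool.
one_gene_pool = hardy_weinberg_chi_square(25, 50, 25)

# Hypothetical pooled sample of two morphs that are isolated gene pools:
# same overall allele frequency, but far too few heterozygotes.
two_gene_pools = hardy_weinberg_chi_square(45, 10, 45)

print(one_gene_pool, two_gene_pools)
```

With one degree of freedom, a statistic above about 3.84 indicates a significant
departure at the 5% level. In this made-up example only the second sample
departs, which is the pattern expected if the morphs were separate species.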

Within many salmonid species, coexisting forms often exhibit contrasting 
life histories: Anadromous individuals hatched in streams or lakes migrate to 
sea before returning to fresh water to spawn, whereas non-anadromous indi- 
viduals spend their entire lives in fresh water. Some landlocked populations, 
lacking present-day access to the ocean, include both stream-resident and lake- 
migratory individuals. Do these various life history types at a given site consti- 
tute separate gene pools? Allozyme studies of several such complexes in 
species such as sockeye salmon (Oncorhynchus nerka; Foote et al. 1989), rainbow 
trout (O. mykiss; Allendorf and Utter 1979), cutthroat trout (O. clarki; Campton 
and Utter 1987), brown trout (Salmo trutta; Hindar et al. 1991), and Atlantic
salmon (S. salar; Ståhl 1987) revealed high genetic similarities between migrato- 
ry and resident fish spawning in the same area, but significant differences be- 
tween spawning populations in disjunct geographic regions (Ferguson 1989; 
Ryman 1983; see also Chapter 6). In other studies, small but statistically signifi- 
cant genetic differences were demonstrated among nearby or sympatric 
salmonids that differed in lifestyle (Baby et al. 1991; Birt et al. 1991; Krueger 
and May 1987; Skaala and Naevdal 1989; Vuorinen and Berg 1989; see review in 
Schluter 1996b). Some of the impediments to gene flow probably reflect habitat 
structure per se, but in some cases innate (gene-based) differences in microhab- 
itat preferences or spawning times apparently augment these external barriers 
(Allendorf 1996). 

Whether or not migratory and resident salmonid populations at a particu- 
lar locale are fully isolated reproductively at the present time, the small genetic 
distances often involved and the polyphyletic appearance of particular life his- 
tory patterns across a species' range suggest that the isolations are evolutionar- 
ily ephemeral. Clearly, lifestyle switches can be rapid and common, so any con- 
temporary genetic separations are probably of recent origin (as indicated also 
by the fact that most of the high-latitude locales under consideration were cov- 
ered by glacial ice as recently as 10,000 years ago). Furthermore, rearing and 
tagging studies of brown trout have shown that freshwater-resident individu- 
als can develop from anadromous parents, and vice versa (Skrochowska 1969), 
indicating a considerable element of phenotypic plasticity in lifestyle. In this 
case, the freshwater-resident lifestyle is plausibly associated with a slow 
growth rate of parr (Hindar et al. 1991), which itself is influenced by both ge- 
netic and environmental factors. 

Molecular analyses of several other flocks of fishes have revealed signifi- 
cant differences in allozymes or mtDNA among sympatric forms, thus con- 
firming the presence of multiple biological species. Examples include represen- 
tatives of the atherinids of central Mexico (Echelle and Echelle 1984), 
cyprinodontids of eastern Mexico (Humphries 1984), various cichlids in 
African Rift Valley lakes (Sage et al. 1984; Sturmbauer and Meyer 1992), and 
the now-extinct cyprinids of Lake Lanao in the Philippines (Kornfield and Car-
penter 1984). However, despite significant allelic differences at particular loci,
overall genetic distances among species remained remarkably small in several 
of these cases, suggesting that the evolutionary separations were very recent. 

A follow-up question concerning such validated species flocks is whether 
the speciations took place allopatrically or sympatrically. In the Allagash basin 
of eastern Canada and northern Maine, coexisting dwarf and normal-sized 
lake whitefish (Coregonus clupeaformis) represent independent gene pools, albeit 
displaying only a small allozyme distance (Kirkpatrick and Selander 1979). If 
sympatric speciation was involved, then populations of dwarf and normal- 
sized fish in the Allagash should be one another's closest relatives. Contrary to 
this expectation, mtDNA findings indicated that the Allagash populations stem 
from a secondary overlap of two monophyletic groups that probably evolved 
in separate refugia during the most recent Pleistocene glaciation (Bernatchez
and Dodson 1990, 1991, 1994). Western populations outside the Allagash basin 
belong to one mtDNA clade, eastern populations to another, and only in the 
Allagash basin do these clades overlap and appear alternately fixed in dwarf 
and normal whitefish. However, subsequent genetic analyses of dimorphic 
whitefishes (dwarf versus normal and limnetic versus benthic) at additional 
Canadian locales (Bernatchez et al. 1996) revealed instances of both "sympatric 
divergence and multiple allopatric divergence/secondary contact events on a
small geographic scale" (Pigeon et al. 1997). 

The most famous fish flocks occur in several African Rift Valley lakes. 
Some of the genetic findings for these flocks have been almost as stunning as 
the biological radiations themselves, which can involve hundreds of now-sym- 
patric cichlid species. In particular, early comparisons of mtDNA sequences, in- 
cluding a normally highly variable portion of the control region, revealed little 
phylogenetic differentiation among morphologically diverse representatives of 
Lake Victoria's flock of more than 500 species, but a large genetic distinction 
(more than 50 assayed base substitutions) from cichlids in nearby Lake Malawi 
(Meyer et al. 1990). Results indicated a recent monophyletic origin for many or 
most Lake Victoria cichlids, conflicting with a traditional notion that the taxo- 
nomic complex is "a super-flock comprised of several lineages whose members 
cut across the boundaries imposed by the present-day lake shores" (P. H. 
Greenwood 1980). Nagl et al. (2000) expanded the mtDNA analyses and con-
cluded that more than one founding cichlid entered Lake Victoria. Verheyen et
al. (2003) extended the mtDNA analyses yet again, and concluded that two 
seeding lineages colonized Lake Victoria from Lake Kivu to the west. Based on 
analyses of nuclear AFLP markers, Seehausen et al. (2003) suggested that the 
broader species flock of which Lake Victoria cichlids are a part may not be 
strictly monophyletic and, at the outset, might have entailed hybrid swarms. 

The Lake Victoria basin was nearly dry about 15,000 years ago, so one 
long-discussed possibility was that explosive sympatric speciation post-dated 
this desiccation event (Fryer 1997, 2001; Johnson et al. 1996; Seehausen 2002). 
However, the analysis by Verheyen et al. (2003) indicates that the major lineage 
diversification took place about 100,000 years ago. Regardless of the exact 
dates, the fact remains that speciation rates in the Lake Victoria assemblage 
have been dramatically high. Lakes Malawi and Tanganyika also house large
cichlid flocks, but these flocks are phylogenetically more diverse (Danley and 
Kocher 2001; Rüber et al. 1999). These other Rift Valley lakes may have sup- 
ported rapid speciations as well, albeit longer ago (McCune 1997). 

The genetic and geological evidence make it tempting to conclude that at 
least some of the prolific speciation in Africa's Rift Valley lakes has been sym- 
patric. However, micro-allopatric speciation is hard to eliminate. Small bodies 
of water inside a lake basin might have retained isolated fish populations dur- 
ing dry periods, or multiple closely related progenitors might have invaded a 
re-formed lake from outside refugia. To address this question more critically, 
researchers have also examined smaller cichlid flocks in volcanic crater lakes. 
These lakes are conical, bowl-like structures with steep walls, so they probably 
have always lacked internal physical barriers to fish dispersal. Yet, genetic 
analyses have shown that mini-flocks of endemic cichlids in two West African 
crater lakes (Lake Barombi Mbo, with eleven species, and Lake Bermin, with 
nine) are each of recent monophyletic origin, a finding consistent with sym- 
patric speciation (Schliewen et al. 1994). For other cichlid mini-flocks in crater 
lakes in Central America, phenotypic and mtDNA analyses indicate that sexu- 
al selection has contributed there to assortative mating and sympatric specia- 
tion (A. B. Wilson et al. 2000). 
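
The monophyly criterion used in such studies can be illustrated with a short
Python sketch. It is offered only as an assumption-laden example: trees are
written as nested tuples, all taxon names are hypothetical, and the functions
are not taken from any of the cited studies. The test asks whether some clade
in a rooted tree contains all of the lake's endemic species and nothing else.

```python
def leaf_set(tree):
    """Return the set of leaf names in a rooted tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset()
    for child in tree:
        leaves |= leaf_set(child)
    return leaves

def is_monophyletic(tree, taxa):
    """True if some clade of the rooted tree contains exactly the given taxa."""
    taxa = frozenset(taxa)
    if leaf_set(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)

lake_endemics = {"lake_sp1", "lake_sp2", "lake_sp3"}  # hypothetical names

# Hypothetical tree in which the lake endemics are one another's closest relatives.
single_origin = (("lake_sp1", ("lake_sp2", "lake_sp3")),
                 ("river_sp1", "river_sp2"))

# Hypothetical tree in which the lake was colonized by more than one lineage.
multiple_origins = (("lake_sp1", "river_sp1"),
                    (("lake_sp2", "lake_sp3"), "river_sp2"))

print(is_monophyletic(single_origin, lake_endemics))     # True
print(is_monophyletic(multiple_origins, lake_endemics))  # False
```

A positive result is consistent with diversification inside the lake but does
not by itself exclude micro-allopatric scenarios.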


ECOLOGICAL SPECIATION. Sympatric and micro-allopatric speciation will 
probably remain difficult to dichotomize sharply, but an emerging perspective 
conceptualizes these processes as “ecological speciation” (Funk et al. 2002; Orr 
and Smith 1998; Schluter 1996b; Streelman and Danley 2003; Via 2002) or 
“adaptive speciation” (Dieckmann et al. 2001). This approach is illustrated by 
multifaceted research on a northern high-latitude fish, the threespine stickle- 
back (Gasterosteus aculeatus) (Bell and Foster 1994). In this species, related eco- 
types that differ in morphology and behavior (skeletal armor, feeding adapta- 
tions, and benthic versus limnetic lifestyle) often co-occur in particular bodies 
of water. Molecular markers interpreted in conjunction with experimental 
studies have shown that multiple unlinked genes underlie the ecotypic differ- 
ences (Peichel et al. 2001); that strong assortative mating between ecotypes in 
various locations has arisen rapidly and repeatedly, both in sympatry and in al- 
lopatry (Rundle et al. 2000; Thompson et al. 1997); that hybrids are often viable 
and fertile; and that distinct ecotypes nonetheless often persist within a lake, 
their differences reinforced by ecological character displacement. In general, 
these studies suggest that divergent ecological selection pressures can play 
huge roles in promoting rapid adaptive radiations and local “speciations” 
(Schluter 2001). 

Several authors have suggested that sexual selection, by driving changes 
in mate recognition, is at least as potent a force as natural selection in promot- 
ing ecological speciation (Panhuis et al. 2001; West-Eberhard 1983). By defini- 
tion, sexual selection operates on features associated with mating success, so 
any phenotypic divergences driven by this phenomenon (within or among ge- 
ographic populations) are also prime candidates for becoming prezygotic RIBs 
(Questiau 1999). Consistent with this notion, speciation rates have been found
to be correlated with apparent intensities of sexual selection in birds (Barra-
clough et al. 1995; Mitra et al. 1996; Møller and Cuervo 1998). Molecular mark-
ers have helped in analyzing several other taxonomic groups, such as Rift Val-
ley cichlids (described above) and columbine plants with diverse floral spurs
(Hodges and Arnold 1995), in which rapid speciation appears to have been as- 
sociated with strong sexual selection on “key innovations” that promoted as- 
sortative mating and speciation. 

Sympatric speciation is easiest to envision when assortative mating and 
disruptive selection operate on the same phenotypic characters (Doebeli 1996; 
Kawecki 1996, 1997; Kondrashov 1986). Data from molecular markers have as- 
sisted in addressing one such case involving male-pregnant seahorses (Jones et 
al. 2003). Microsatellite analyses of genetic parentage demonstrated that sea- 
horses mate assortatively by body size in nature. Using a quantitative genetic 
model, this empirical mating preference was shown to be strong enough, when 
coupled with even modest disruptive natural selection, to produce rapid evo-
lutionary divergence for body size and thereby curtail gene flow between pop-
ulations of large and small fish in sympatry. This finding is potentially relevant
to speciation modes in these fishes because several instances are known from 
around the world in which closely related species of dwarf and normal-sized 
seahorses co-occur. 

In many of the host-switching insects, fish species flocks, and other sym- 
patric species assemblages in which the RIBs are primarily prezygotic, repro- 
ductive isolation may be rather fragile (i.e., easily lost). Accordingly, these eco- 
typic species may be evolutionarily ephemeral not only due to routine 
extinction processes, but also because the lineages may re-amalgamate via in- 
trogressive hybridization if the ecological or behavioral barriers dissolve, for 
whatever reason. Thus, even if rapid ecological speciation proves to be a com- 
mon phenomenon in contemporary studies of various organisms, its broader 
significance in generating new lineages that regularly withstand the longer 
tests of evolutionary time would remain debatable. In any event, the possibili- 
ty that ecological or behavioral barriers can dissipate, causing gene pools of 
ecotypic species to merge through hybridization, has become of practical con- 
servation concern for the cichlid fishes in Lake Victoria. There, human-caused 
eutrophication of the lake’s otherwise clear waters threatens to compromise the 
visual mate-choice cues that otherwise help to maintain each species’ current 
genetic integrity (Seehausen et al. 1997). 


What are the temporal durations of speciation processes? 


The instances highlighted in previous sections and in Box 7.3 demonstrate un- 
equivocally that speciation can and sometimes does occur on short ecological 
time scales (within a matter of decades or less, at the extreme). However, such 
examples might be highly misrepresentative of speciation processes in general, 
perhaps having attracted special research attention precisely because they are 
unusual, and also because they offer exceptional experimental or observation-
al tractability. Almost certainly, most speciations are associated with and pro-
moted by geographic population separations (e.g., Coyne and Price 2000), and
as such should be interpreted as extended temporal processes rather than as 
point events in evolution. What are the approximate durations of geographic 
speciation in absolute time? 

Avise and Walker (1998) introduced a way to address this issue by identi- 
fying temporal ceilings and floors on speciation durations, using molecular ev- 
idence and the following logic. The temporal ceilings are estimates of separa- 
tion times between extant sister species, which place an upper bound on how 
long ago genetic separation leading to speciation might have been initiated. 
The temporal floors are estimates of separation times between recognizable ge- 
ographic clades or molecular “phylogroups” at the intraspecific level, which 
indicate minimum evolutionary times that have transpired without speciation 
having gone to apparent completion (at least according to current taxonomic 
assignments). Thus, the approximate mean duration of geographic speciation 
must lie somewhere (perhaps at the midpoint?) between these temporal floors 
and ceilings. The sidereal dates of these floors and ceilings are provisionally es- 
timated using molecular clocks as applied to empirical genetic distances ob- 
served between relevant extant taxa. 
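
The floor-and-ceiling logic can be made concrete with a small arithmetic sketch
in Python. Everything numerical here is an assumption for illustration: the
clock calibration of 2% sequence divergence per million years, the divergence
values, and the function names are hypothetical and are not taken from Avise
and Walker (1998).

```python
from statistics import mean

# Assumed clock calibration: 2% sequence divergence per million years.
# This value is an illustrative assumption, not a figure from the text.
ASSUMED_RATE_PERCENT_PER_MYR = 2.0

def mean_separation_time_myr(percent_divergences,
                             rate=ASSUMED_RATE_PERCENT_PER_MYR):
    """Mean separation time (millions of years) implied by a list of
    pairwise percent sequence divergences under a constant clock."""
    return mean([d / rate for d in percent_divergences])

def speciation_duration_bounds(sister_species_divergences,
                               phylogroup_divergences,
                               rate=ASSUMED_RATE_PERCENT_PER_MYR):
    """Return (floor, ceiling, midpoint) in millions of years.

    ceiling: mean separation time between extant sister species
    floor:   mean separation time between intraspecific phylogroups
    """
    ceiling = mean_separation_time_myr(sister_species_divergences, rate)
    floor = mean_separation_time_myr(phylogroup_divergences, rate)
    return floor, ceiling, (floor + ceiling) / 2.0

# Hypothetical percent sequence divergences.
sister_species = [4.0, 6.0, 5.0]
phylogroups = [1.0, 3.0, 2.0]

floor, ceiling, midpoint = speciation_duration_bounds(sister_species, phylogroups)
print(f"floor={floor} Myr, ceiling={ceiling} Myr, midpoint={midpoint} Myr")
```

Because the estimates scale inversely with the assumed rate, any uncertainty in
the clock calibration carries over directly to the inferred durations.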

The first application of this methodology involved birds. Klicka and Zink 
(1997) already had compiled genetic distances (based on mtDNA comparisons) 
between extant sister species of North American songbirds, from which they 
concluded that most of these speciations had been initiated in the Pliocene or 
early Pleistocene, rather than in the middle to late Pleistocene, as had been con-
ventional wisdom (e.g., Brodkorb 1971; Rand 1948). Likewise, Avise and Walk-
er (1998) then compiled a summary of the mtDNA literature on avian intraspe-
cific phylogroups, most of which (76%) dated by molecular evidence to various
separation times during the Pleistocene. Overall, the midpoint between in- 
ferred separation dates for sister species and those for intraspecific phy- 
logroups suggested that the mean duration of a typical avian speciation might 
be approximately 2 million years. Avise et al. (1998) then repeated this entire 
exercise for other major vertebrate assemblages (mammals, fishes, and reptiles 
and amphibians). The composite evidence again suggested that vertebrate spe- 
ciation durations were often on the order of 1-3 million years. 

Johnson and Cicero (2004) later reexamined these estimates for birds, us- 
ing additional molecular data for more taxa as well as several taxonomic re- 
alignments that had accumulated in the interim. In particular, some of the pre- 
sumptive "sister species" in the Klicka and Zink (1997) study had proved not 
to be closest relatives when more taxa were examined, and some of the "in- 
traspecific phylogroups" in the Avise and Walker (1998) compilation had been 
raised to full species status. These revisions, taken at face value and in con- 
junction with the newer data, tended to translate into accordingly shorter esti- 
mates of avian speciation durations, which might typically thus last "only" 
hundreds of thousands of years. Results again show that species concepts and 
taxonomic protocols can impinge considerably on perceptions of "speciation" 
processes. 

Regardless of which temporal estimates are most valid, it remains clear
from molecular evidence that allopatric speciation, which undoubtedly pre- 
dominates in most vertebrates and many other animal groups as well, is a 
rather slow process. Thus, the temporal durations of classic geographic specia- 
tion typically extend far beyond the time scales of various sympatric and eco- 
logical speciation events. 


How prevalent is co-speciation? 


Parasites (or other symbionts) may sometimes be tied so closely to specific host 
species that any speciation events in the host taxa would simultaneously confer 
reproductive isolation upon the symbionts they sponsor. When repeated time 
and again during evolution, this “co-speciation” phenomenon should result in 
near-perfect topological matches between the phylogenies of host taxa and 
their associates. Conversely, if the utilizers occasionally or frequently shift 
among host species (as occurs, for example, in some of the phytophagous in- 
sects described earlier), then the phylogenies of hosts and their associates could 
be conspicuously uncoupled, the host shifts in effect producing reticulations of 
guest lineages across host lineages (e.g., Baverstock et al. 1985; Page 1993, 
2003). Thus, there has been considerable empirical interest in jointly assessing 
host and guest phylogenies using molecular markers. 
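
One simple way to quantify topological match between a host tree and a guest
tree is sketched below in Python. It is only an illustration under stated
assumptions: trees are rooted and written as nested tuples, each guest occupies
exactly one host, and all names are hypothetical. Each guest is relabeled with
the name of its host, and the clades found in one tree but not the other are
counted; a count of zero means the two topologies are identical.

```python
def clades(tree):
    """Return (leaf set, set of all multi-leaf clades) for a rooted
    tree given as nested tuples of leaf-name strings."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def relabel(tree, mapping):
    """Replace each guest leaf name with the name of its host."""
    if isinstance(tree, str):
        return mapping[tree]
    return tuple(relabel(child, mapping) for child in tree)

def clade_mismatches(host_tree, guest_tree, guest_to_host):
    """Number of clades present in one tree but absent from the other."""
    _, host_clades = clades(host_tree)
    _, guest_clades = clades(relabel(guest_tree, guest_to_host))
    return len(host_clades ^ guest_clades)

# Hypothetical hosts H1..H4 and their guests g1..g4.
guest_to_host = {"g1": "H1", "g2": "H2", "g3": "H3", "g4": "H4"}
host_tree = (("H1", "H2"), ("H3", "H4"))
guest_cospeciating = (("g1", "g2"), ("g3", "g4"))
guest_incongruent = (("g1", "g3"), ("g2", "g4"))

print(clade_mismatches(host_tree, guest_cospeciating, guest_to_host))  # 0
print(clade_mismatches(host_tree, guest_incongruent, guest_to_host))   # 4
```

A nonzero count shows only that the trees disagree; it does not by itself
identify host switching as the cause.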

First, however, a cautionary note: Lack of congruence between the phylo- 
genies of coevolving species could also result from asynchronous speciations 
and lineage sorting processes (Charleston 1998; Page 1994), such that surviving 
lineages in the parasite or symbiont trace to phylogenetic splits either predat- 
ing or postdating nodes in the host phylogeny (Figure 7.6). Patterns of lineage 
sorting in a guest phylogeny embedded within a host phylogeny bear some 
analogy to patterns that can distinguish a gene tree from a species tree (see 
Chapter 4) and are also somewhat reminiscent of distinctions between ortholo- 
gy and paralogy in phylogenetic analyses (Fitch 1970; see Chapter 1). 

Remarkable examples of coincident phylogenies are provided by aphids 
(Aphididae) and their bacterial symbionts in the genus Buchnera (Clark et al. 
2000; Moran et al. 1993, 1998; Tamas et al. 2002). These insects feed on plant 
sap, but depend upon Buchnera bacteria living within their cells for nutrients 
not present in their phloem diet. For 100 million years or more, aphids and 
their bacterial endosymbionts probably have been tightly interdependent mu- 
tualists, so perhaps it is not too surprising that their phylogenies (estimated 
from molecular data) proved to be almost perfectly coincident (Figure 7.7). This 
example extends further: Like fleas on fleas, tiny plasmids are symbiotically as- 
sociated with the Buchnera bacteria, and molecular phylogenies of assayed 
plasmids have proved to be congruent with those of their hosts (Funk et al. 
2000). These impressive juxtapositions of multiple symbiont phylogenies 
might best be described as coevolution rather than co-speciation, however, be- 
cause neither the bacteria nor their plasmids (unlike their aphid hosts) are sex- 
ual reproducers in the normal sense to which biological species concepts typi- 
cally apply. 

[Figure 7.6 panels: (A) Co-speciation of host and parasite; (B) Asynchrony of
host/parasite speciations and lineage sorting; (C) Co-speciation, with
occasional host switching]

Figure 7.6 Possible relationships between the species phylogeny of a parasite and 
that of its host taxon under models of (A) co-speciation, wherein host and parasite 
phylogenies are perfectly concordant; (B) asynchronous speciation, wherein non-con- 
cordant lineage sorting characterizes host and parasite lineages; and (C) host switching 
by the parasite. The parasite phylogeny is depicted by the inner branches; black dots 
indicate speciations. The thick outer branches depict the host phylogeny.


As judged by similar molecular phylogenetic analyses (of mitochondrial 
genes, nuclear genes, or both) in numerous other taxonomic groups, perfect con- 
gruence between the cladogenetic histories of interacting species appears to be 
the exception, not the rule. For example, the phylogeny of Schistosoma trematodes 
bears little resemblance to the phylogeny of the snails they parasitize (Morgan et 
al. 2002), and the same can be said for Puccinia rust fungi and their plant hosts 
(Roy 2001), Wolbachia bacteria and their insect hosts (Kittayapong et al. 2003; 
Shoemaker et al. 2002), herbivorous Ophraella beetles and the plants they utilize
(Funk et al. 1995), mushroom-eating Drosophila and their nematode parasites
(Perlman et al. 2003), and brood parasitic finches and their avian host species
(Klein and Payne 1998). Another common outcome, illustrated by various avian
taxa and their lice (Johnson et al. 2003; Page et al. 1998; Paterson et al. 2000) and
by marine sponges and their bacterial symbionts (Erpenbeck et al. 2002), is partial
correspondence between the phylogenies of host taxa and their associates. Such
instances indicate a tendency for co-speciation, but with the patterns confounded
by occasional host shifts or other complicating factors (see above) that have par-
tially uncoupled historical coevolution between the interacting taxa.


Figure 7.7 Near-perfect phylogenetic congruence between aphids and their en-
dosymbiotic Buchnera bacteria. The diagram on the left is at the taxonomic family lev-
el (Aphididae), and that on the right is for a subset of aphid hosts mostly in the genus
Uroleucon. (After Moran 2001.)


Another kind of species interaction that has been hypothesized to promote 
congruent phylogenies (as well as sympatric speciation) is “social parasitism,” as 
illustrated by Polistes wasps. In these colonial hymenopterans, a social parasite 
uses workers of another social insect species (usually a close evolutionary rela- 
tive) to rear its own progeny. However, phylogenetic analyses based on al- 
lozymes and on rDNA sequences from the mitochondrial genome revealed that 
all social parasites in the genus are monophyletic and recently evolved relative to 
their host species (Carpenter et al. 1993; Choudhary et al. 1994). These results are 
more consistent with the idea that speciations in these social parasites occurred 
allopatrically and independently of the evolution of social parasitism. 


Can morphologically cryptic species be diagnosed?


Molecular markers can be of great utility in diagnosing closely related species, 
even where morphological or other traditional markers fail or are ambiguous. In 
an illustrative early example, several sibling species of Drosophila proved readily 
separable (see Figure 7.2) with a battery of allozyme markers despite their near- 
identity in morphological appearance (Ayala and Powell 1972b; Ayala et al. 1970; 
Hubby and Throckmorton 1968). Such molecular markers can find application in 
diagnosing the species composition of field collections where this is otherwise 
uncertain. For example, Pascual et al. (1997) used a battery of allozyme, mtDNA, 
and RAPD markers to distinguish three sibling species of Drosophila (subobscura, 
athabasca, and azteca) and thereby illuminate the geographic distributions of these 
flies following their colonization of North America's west coast. 

Sibling species of Trachyphloeus weevils (Jermiin et al. 1991) and Chthamalus 
barnacles (Hedgecock 1979) likewise have been readily distinguished by al- 
lozymes, as have various Alpheus and Penaeus shrimp by mtDNA sequences 
(Mathews et al. 2002; Palumbi and Benzie 1991). A polychaete worm (Capitella 
capitata) used as an indicator of marine pollution and environmental distur- 
bance was once thought to be a single cosmopolitan species, but allozyme 
analyses indicated the presence of at least six sibling species (Grassle and 
Grassle 1976). In corals assigned to Montastrea annularis, some morphological 
variation suspected to be due to phenotypic plasticity proved instead, upon al- 
lozyme analysis, to involve distinct sympatric species (Knowlton et al. 1992; 
Lopez et al. 1999). In sea anemones (genus Anthopleura), allozyme analyses 
confirmed the presence in sympatry of two closely related species that differ in 
having solitary versus clonal reproductive lifestyles (McFadden et al. 1997). In 
general, allozymes and other molecular markers have identified numerous sib- 
ling species in the sea (Knowlton 1993) and elsewhere (e.g., Gómez et al. 2002; 
Jarman and Elliott 2000; McGovern and Hellberg 2003; Trewick 2000). Such 
studies can inform ecology and management as well as systematics. For exam- 
ple, molecular genetic analyses of sibling species of Carcinus crabs have helped 
to identify cryptic marine invasions and the geographic sources of colonizing 
taxa (Geller et al. 1997). Molecular genetic identifications of scale insects 
(Quadraspidiotus sp.) enabled researchers to determine which particular pest 
species had been attracted to artificial pheromone traps (Frey and Frey 1995). 

Various classes of molecular markers can likewise be used to distinguish 
closely related plant species (e.g., Whitty et al. 1994). An interesting but atypi- 
cal example involved use of species-specific “transposon signatures” to distin- 
guish species and subspecies within the Zea/Tripsacum complex of maize-like 
plants (Purugganan and Wessler 1995). Transposable elements (TEs; see Box 
1.3) are abundant components of prokaryotic and eukaryotic genomes, in 
which they reside at specific positions often characteristic of particular popula- 
tions or species. The approach employed by Purugganan and Wessler was to 
amplify particular DNA regions containing TEs and then digest them with re- 
striction enzymes. The resulting digestion profiles provided diagnostic molec- 
ular signatures for different Zea taxa. 
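
The idea of a diagnostic digestion profile can be illustrated with an in silico
digest in Python. This is a hedged sketch rather than the protocol of
Purugganan and Wessler (1995): the amplicon sequences and taxon labels are
invented, and EcoRI (recognition site GAATTC, cut after the first G) is chosen
here only as a familiar example enzyme.

```python
def digest_fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Fragment lengths obtained by cutting a linear sequence at every
    occurrence of a recognition site.

    cut_offset is the cut position within the site (1 for EcoRI, G^AATTC).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]

# Hypothetical amplicons from two taxa; the second has lost one site
# through a single base substitution (GAATTC -> GACTTC).
taxon_a = "ATGC" + "GAATTC" + "TTAGGC" + "GAATTC" + "AA"
taxon_b = "ATGC" + "GAATTC" + "TTAGGC" + "GACTTC" + "AA"

profile_a = digest_fragment_lengths(taxon_a)
profile_b = digest_fragment_lengths(taxon_b)
print(profile_a)  # [5, 12, 7]
print(profile_b)  # [5, 19]
```

Taxa whose amplicons differ in the presence or position of recognition sites
yield different sets of fragment lengths, which is what makes such profiles
usable as signatures.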

Zooxanthellae are unicellular endosymbiotic algae that live within the tis-
sues of diverse marine invertebrates such as sea anemones, corals, and gor-
gonians. Due to a paucity of distinguishing morphological characters, most
zooxanthellae had been placed in the genus Symbiodinium, but the numbers of 
species, their evolutionary relationships, and their particular host associations 
were poorly known. Rowan and Powers (1991) assayed nuclear rDNAs from 
zooxanthellae isolated from 22 invertebrate taxa and identified several geneti- 
cally distinct forms, some of which resided in congeneric hosts. Conversely, 
some zooxanthellae that were genetically similar came from divergent hosts of 
ordinal or greater taxonomic separation. Results suggested that many cryptic 
species of zooxanthellae exist and that symbioses can arise by symbiont shifts 
among even distant host taxa. The conclusion that Symbiodinium is a highly di- 
verse and ancient evolutionary assemblage was further supported by the find- 
ing that the collective genetic diversity of its rDNA sequences rivals that be- 
tween taxonomic orders of non-symbiotic dinoflagellates (Rowan and Powers 
1992). Furthermore, this diversity appears to be organized into about half a 
dozen highly divergent evolutionary clades, according to evidence from 
cpDNA and nuclear rDNA sequences (LaJeunesse 2001; Pochon et al. 2001; 
Santos et al. 2002). 

Fig-pollinating wasps in the family Agaonidae have refined mutualistic interactions
with Ficus trees: The fig trees depend on female wasps for pollination,
and the wasps depend on fig inflorescences for oviposition sites and nurseries
for their young (Janzen 1979; Wiebes 1979). These tightly integrated
fig-wasp symbioses have long provided model systems for research into the 
ecology and evolution of animal-plant mutualisms, with a common assump- 
tion being that one species of pollinator wasp specializes on each species of 
host fig. However, recent molecular analyses have cast doubt on this latter as- 
sumption by documenting substantial mtDNA sequence differences among 
wasps within a majority of fig species surveyed, thus strongly suggesting the 
presence of many formerly cryptic wasp species (Molbo et al. 2003). By under- 
mining the prevalent notion of a strict one-to-one specificity between figs and 
their pollinators, these genetic findings will necessitate various reinterpreta- 
tions of conventional wisdom about this mutualistic complex (Machado et al. 
2001; see also Weiblen and Bush 2002). 

Among the vertebrates, many problematic species have been distin- 
guished using molecular characters (e.g., Belfiore et al. 2003). Morphologically 
cryptic fish species, such as several closely related species of Thunnus tunas 
(Bartlett and Davidson 1991), can be distinguished by allozyme or mtDNA 
markers (see early reviews in Powers 1991; Shaklee 1983; Shaklee et al. 1982). In 
Gastrotheca frogs, immunological and protein electrophoretic evidence revealed 
that at least six species in two different groups formerly had masqueraded un- 
der the name G. riobambae (Duellman and Hillis 1987; Scanlan et al. 1980). Re- 
markable examples of morphological stasis despite extensive speciation are 
provided by the lungless plethodontid salamanders, in which multiple fixed 
allozyme differences proved to be common among populations formerly 
thought to be conspecific (Larson 1984, 1989). For example, the slimy salamander
of the eastern United States had been considered a single species (Plethodon
glutinosus), but dramatic and often sympatric differences in protein markers 
(Highton 1984; Maha et al. 1989) revealed the presence of at least 16 groups that 
probably warrant recognition as species or semispecies (Highton et al. 1989). 
Among avian taxa, sibling species in the flycatcher genera Empidonax and Con- 
topus are difficult to distinguish morphologically, but species and clades 
proved to exhibit diagnostic molecular markers (Johnson and Cicero 2002; 
Zink and Johnson 1984). In general, birds are among the best known of organisms
at the species level, but in the early 1990s a shrike named Laniarius liberatus
became the first "new" avian species described solely on the basis of molecular 
data (Hughes 1992; E. F. G. Smith et al. 1991). 

In many taxonomic groups, adults of different species are distinguishable 
morphologically, whereas their juveniles or larvae are not. By comparing mo- 
lecular characters in unknown larvae against known adults, species assign- 
ments can often be made. In some pioneering examples of this approach, re- 
searchers used protein electrophoresis or DNA restriction mapping to make 
species assignments for various larval marine organisms ranging from oysters 
(Hu et al. 1992) to fishes (Graves et al. 1988, 1989; Morgan 1975; Sidell et al. 
1978; Smith and Benson 1980; Smith and Crossland 1977). The introduction of 
PCR-based methods dramatically improved scientists' abilities to identify 
small larval forms or eggs, as, for example, when Silberman and Walsh (1992) 
amplified and digested nuclear 28S rRNA genes from phyllosome larvae of
three species of spiny lobsters (Panulirus argus, P. guttatus, and P. laevicauda) and 
thereby revealed species-diagnostic DNA banding patterns. In another inter- 
esting early application of PCR-based larval identification, Olson et al. (1991) 
amplified and sequenced 16S rDNAs from sea cucumbers (Echinodermata) 
and thereby assigned a collection of bright red pentactula larvae to Cucumaria
frondosa. This assignment came as a surprise—based on coloration and mor- 
phological considerations, these larvae had been thought to belong to a dis- 
tantly related sea cucumber, Psolus fabricii. In recent years, PCR-assisted DNA 
sequencing has become almost routine for taxonomic identifications of marine 
larvae (Coffroth and Mulawka 1995) and in systematic assessments of small 
sea creatures such as copepods (Bucklin et al. 1999). 
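
The matching of unknown larvae against known adults can be sketched as a nearest-reference lookup. This is a minimal illustration under stated assumptions: the 20-bp sequences are invented, the sequences are taken to be already aligned, and a simple mismatch count stands in for whatever distance measure a real study would use.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_species(unknown: str, references: dict[str, str]) -> tuple[str, int]:
    """Return (species, mismatches) for the reference closest to the unknown."""
    best = min(references, key=lambda name: hamming(unknown, references[name]))
    return best, hamming(unknown, references[best])


# Hypothetical aligned fragments; real 16S rDNA reads would be far longer.
adult_references = {
    "Cucumaria frondosa": "ACGTTGCAAGCTTAGGCTAA",
    "Psolus fabricii": "ACGATGCTAGCATAGCCTTA",
}
larva = "ACGTTGCAAGCTTAGGCTTA"  # made-up read from an unidentified larva

species, mismatches = assign_species(larva, adult_references)
print(species, mismatches)
```

In practice an assignment would be accepted only if the best match were much closer than the next-best candidate and the reference set covered the plausible species.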

PCR-based methods and additional molecular techniques have also fur- 
ther revolutionized scientists' abilities to discriminate among species of tiny or 
microbial forms, including mites (Fenton et al. 1995), soil nematodes (Floyd et 
al. 2002), mosses (A. J. Shaw 2000), fungi (Bidochka et al. 1997; Fell et al. 1992), 
foraminiferans (Bauch et al. 2003), viruses (e.g., Allander et al. 2001), and bacte- 
ria (Laguerre et al. 1994). For example, Wimpee et al. (1991) employed two 
highly conserved regions of the luxA gene as PCR primers to develop species-specific
probes that can identify four major groups of marine luminous bacteria
from field isolates. 

Schmidt et al. (1991) used molecular assays of bulk genomic DNA to ad- 
dress the species composition of picoplankton collections from the central Pa- 
cific Ocean. They cloned mixed populations of DNA into phage, screened these 
clones for the presence of 16S rRNA genes, and sequenced the isolates. When
compared against an established information base of rRNA gene sequences,
these data allowed identification of 15 distinctive bacterial sequences that 
could be related to cyanobacteria and proteobacteria. Similar molecular ap- 
proaches have been employed to characterize the approximate phylogenetic 
positions of previously unknown microbial taxa from a wide variety of soils 
and aqueous environments (Angert et al. 1998; Blank et al. 2002; Dawson and 
Pace 2002; de Souza et al. 2001; Giovannoni et al. 1990; Weller and Ward 1989; 
Weller et al. 1991). Although such molecular analyses are still quite some way 
from providing complete species-level descriptions in the prokaryotic world 
(Curtis et al. 2002), they can be applied to all bacterial forms (whether they can 
be artificially cultured or not), and they are revolutionizing scientific capabili- 
ties to delve into microbial ecology and diversity (Pace 1997; Seidler and 
Fredrickson 1995; Wilson 2003). 

In industrial and epidemiological areas also, molecular markers routinely 
play many key roles in distinguishing otherwise cryptic lineages. For example, 
Salama et al. (1991) employed subspecies-specific rRNA gene probes to identi- 
fy Lactococcus lactis cremoris, a bacterium whose few available strains are relied 
upon by the dairy industry for the manufacture of cheddar cheese that is free 
of fermented and fruity flavors. Similarly, Regnery et al. (1991) employed am- 
plified DNA sequences from the citrate synthase and 190-kDa antigen genes as 
substrates for molecular assays that proved to distinguish various rickettsial 
species causing spotted fever. In another application with epidemiological rel- 
evance, PCR-based assays were used to identify sequences from the introduced 
West Nile virus in wild birds and mosquitoes from the northeastern United 
States, thereby helping to assess the origins and spread of the West Nile disease 
outbreak in North America (Anderson et al. 1999). 

As transmitting agents for many human diseases, mosquitoes began to re- 
ceive molecular genetic attention more than two decades ago. Among the early 
efforts was the identification by Miles (1978) and Finnerty and Collins (1988) of 
diagnostic allozyme and RFLP markers for mosquitoes in the Anopheles gambi- 
ae complex, which includes some of the primary vectors for African malaria. In 
the United States, another Anopheles mosquito (formerly considered A. quadri- 
maculatus) proved upon molecular and cytological examination to consist of at 
least four cryptic species. Numerous molecular markers, including allozymes 
(Narang et al. 1989a,b,c), rDNAs (Mitchell et al. 1993), and mtDNA (Kim and
Narang 1990), showed concordant genetic partitions that also agreed with re- 
productive boundaries as revealed in experimental crosses. Indeed, many mo- 
lecular characters were diagnostic, enabling construction of dichotomous keys 
to species identification (Table 7.4). Other pathbreaking molecular studies on 
mosquitoes discriminated morphologically cryptic and sometimes sympatric 
forms of Aedes, including behavioral types in the A. aegypti complex, the pri- 
mary vector to humans of dengue fever and yellow fever viruses (Munster- 
mann 1988; Powell et al. 1980; Tabachnick and Powell 1978; Tabachnick et al. 
1979). Examples of more recent developments in molecular genetic diagnosis 
and systematics of various mosquito taxa can be found in Krzywinski et al. 
(2001), Lehmann et al. (2003), Miller et al. (1996), Munstermann (1995), and
Walton et al. (2001). The entire nuclear genome of Anopheles gambiae was recently
published (Holt et al. 2002).

TABLE 7.4. Diagnostic allozyme loci and dichotomous biochemical key to
four sibling species in the Anopheles quadrimaculatus complex of
mosquito sibling species

Diagnostic lociᵃ for the indicated species pairs

A:B      A:C      A:D      B:C      B:D      C:D
Idh-1    Acon-1   Acon-1   Acon-1   Acon-1   Got-1
Idh-2    Idh-2    Idh-2    Idh-1    Idh-1    Had-1
Est-2    Had-1    Got-1    Had-1    Got-1    Had-3
Est-5    Had-3    Got-2    Had-3    Got-2    Pep-4
Est-7    Pep-2    Pep-2    Got-2    Pep-2    Pgi-1
Had-1    Got-2    Pep-4    Pep-2    Pep-4    Me
6-Pgd-1  Pgi-1    Me-1     Pgi-1    Me-1     Est-2
Est-2 Mpi-1 Est-4 Est-2 Mpi-1
Est-6 Est-5 Est-7
Mpi-1 Est-6 Mpi-1
6-Pgd-1 Est-7
Xdh-3 Mpi-1
Ao-1 Xdh-3

Biochemical keyᵇ

1. Mpi-1 slow (62 allele, rarely with 52 as heterozygote) ... Species D
   Mpi-1 faster (78 or greater) ... Go to 2
2. Idh-1 slow (86) and Idh-2 fast (162) ... Species B
   Idh-1 faster (≥100, sometimes with 86 as heterozygote);
   Idh-2 fast or slower (100, 132, 162) ... Go to 3
3. Had-3 slow (45); Pgi slow (95) ... Species C
   Had-3 faster (100, sometimes with 45 as heterozygote);
   Pgi faster (100, rarely with 95 as heterozygote) ... Species A

Source: After Narang et al. 1989b.

ᵃ Those providing correct identification with probability >99%.

ᵇ In this key (one of many that could be generated), numbers indicate electromorph gel mobilities
in an unknown sample relative to the electromorph in a standard strain.
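
The dichotomous biochemical key in Table 7.4 lends itself to a short procedural sketch. The function below is only an illustration: the representation of a genotype as a pair of relative mobilities, the function name, and the "undetermined" fallback are assumptions, as is the reading of the Idh-1 "faster" threshold as 100 or greater.

```python
Genotype = tuple[int, int]  # two electromorph mobilities scored at one locus


def identify_species(mpi1: Genotype, idh1: Genotype, idh2: Genotype,
                     had3: Genotype, pgi: Genotype) -> str:
    """Walk the three couplets of the key; return 'A' to 'D' or 'undetermined'."""
    # Couplet 1: Mpi-1 slow (62, rarely 62/52) marks species D.
    if max(mpi1) == 62:
        return "D"
    if min(mpi1) < 78:
        return "undetermined"
    # Couplet 2: Idh-1 slow (86) together with Idh-2 fast (162) marks species B.
    if set(idh1) == {86} and set(idh2) == {162}:
        return "B"
    if max(idh1) < 100:  # assumed threshold for "Idh-1 faster"
        return "undetermined"
    # Couplet 3: Had-3 slow (45) with Pgi slow (95) marks C; both faster marks A.
    if set(had3) == {45} and set(pgi) == {95}:
        return "C"
    if max(had3) == 100 and max(pgi) == 100:
        return "A"
    return "undetermined"


# A made-up specimen scored at the five loci used by the key.
specimen = identify_species(mpi1=(100, 100), idh1=(100, 86), idh2=(132, 132),
                            had3=(45, 45), pgi=(95, 95))
print(specimen)
```

Returning "undetermined" rather than forcing a call reflects that a key of this kind covers only the allele combinations it lists.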


Hebert et al. (2003; see also Tautz et al. 2003) raised the prospect that rou- 
tine biodiversity assessments of the future may involve “DNA barcodes” more 
than conventional taxonomic appraisals based on morphology. They suggested 
that traditional systematic expertise is collapsing rapidly for many taxonomic 
groups, and that “the sole prospect for a sustainable identification capability 
lies in the construction of systems that employ DNA sequences as taxon barcodes.”
This Orwellian specter is perhaps more sad than exciting.

Should a phylogenetic species concept replace the BSC?


Throughout the twentieth century and continuing today, the biological species 
concept (BSC) has been the major theoretical framework orienting research on 
the origins of biological diversity. However, a serious recent challenge to the 
BSC has come from some systematists, who argue that it lacks a sufficient phy- 
logenetic perspective and, hence, provides an inappropriate guide to the origins 
and products of evolutionary diversification (de Queiroz and Donoghue 1988; 
Donoghue 1985; Eldredge and Cracraft 1980; Mishler and Donoghue 1982; Nel- 
son and Platnick 1981; see reviews in Hull 1997; Wheeler and Meier 2000). Mod- 
ern critics of the BSC argue that "reproductive isolation should not be part of 
species concepts” (McKitrick and Zink 1988) and that “as a working concept, 
the biological species concept is worse than merely unhelpful and non-opera- 
tional—it can be misleading” (Frost and Hillis 1990). These criticisms have led to 
a call for replacement of the BSC with a phylogenetic species concept, or PSC 
(see Box 7.1), under which a species is defined as a monophyletic group com- 
posed of “the smallest diagnosable cluster of individual organisms within 
which there is a parental pattern of ancestry and descent” (Cracraft 1983). 

One key motivation for this suggestion is that biological speciations, as de- 
scribed above, often result initially in paraphyletic taxa (an anathema to 
cladists) at the species level. In principle, this “problem” could be remedied un- 
der the PSC by elevating all diagnosable populations to full species rank (Om- 
land et al. 1999; Voelker 1999), or, alternatively, by synonymizing paraphyletic 
taxa and the subset taxa nested therein. From the perspective of the BSC, how- 
ever, neither of these alternatives is desirable, in part because they would either 
ignore the genetic and reproductive distinctness of the nested lineage or neg- 
lect what might be high gene flow within the paraphyletic lineage (Funk and 
Omland 2003; Olmstead 1995; Sosef 1997; Wiens and Penkrot 2002). Thus, the 
PSC can also be justifiably accused of being biologically unhelpful, if not mis- 
leading (Johnson et al. 1999), even in the strict genealogical context it otherwise 
intends to inform (see also Wiens 1999). 

Because molecular data provide unprecedented power for phylogeny esti- 
mation, it might be supposed that molecular evolutionists would be among the 
strongest advocates for the PSC, but this has not necessarily been the case 
(Avise 2000a,b; Avise and Wollenberg 1997). One serious difficulty with exist- 
ing PSC proposals concerns the nature of the evidence required to justifiably 
diagnose a monophyletic group warranting species recognition. Molecular 
technologies have made it abundantly clear that multitudinous derived traits 
often can be employed to subdivide named species into diagnosable subunits 
(see Chapter 6). Indeed, most individuals and family units within sexually re- 
producing species can be distinguished from one another with high-resolution 
molecular assays. If each individual or kinship unit is genetically unique, then 
to group multiple individuals into phylogenetic “species” would require that 
distinctions below some arbitrary threshold be ignored (unless each specimen 
is to be considered a unique species). The evolutionary significance of any such 
threshold surely must be questionable. For these and other reasons, Avise and
Ball (1990) suggested that if the broader framework of the PSC were to contribute
to a significant advance in systematic practice (as they believed that it
could), a shift from issues of diagnostics to issues of magnitudes and patterns 
of phylogenetic differentiation, and of the historical and reproductive reasons 
for such patterns, would be required. Toward that end, they introduced the no- 
tion of “genealogical concordance” principles that might be employed to com- 
bine desirable elements of both the PSC and the BSC. 

Within any sexual organismal pedigree, allelic phylogenies can differ 
greatly from locus to locus (Ball et al. 1990; Baum and Shaw 1995; Maddison 
1997), if for no other reasons than the Mendelian nature of meiotic segregation 
and syngamy and the inevitable vagaries of lineage sorting within and among 
gene trees. An array of individuals phylogenetically grouped by one locus may 
differ from an array of individuals grouped by another locus, unless some 
overriding evolutionary force has concordantly shaped the phylogenetic struc- 
tures of multiple quasi-independent genes. One such force expected to pro- 
mote genealogical concordance across loci (aspect II; see Chapter 6) is intrinsic 
reproductive isolation (the focal point of the BSC). Through time, due to 
processes of lineage turnover, biological species isolated from one another by 
intrinsic RIBs inevitably tend to evolve toward a status of reciprocal monophy- 
ly in particular gene genealogies. Furthermore, through time, the genealogical 
tracings of independent loci almost inevitably sort in such a way as to partition 
these species concordantly. Thus, various aspects of genealogical concordance 
per se become deciding criteria by which biologically meaningful genetic par- 
titions can be distinguished from partitions that are "trivial" or gene-idiosyn- 
cratic with respect to organismal phylogeny. 

However, for populations that are geographically isolated for sufficient 
lengths of time relative to effective population sizes, genealogical concordance 
across loci also can arise from purely extrinsic barriers to reproduction. As em- 
phasized in Chapter 6, dramatic phylogenetic partitions are routinely observed 
among populations considered conspecific under the BSC. It might be argued 
that such populations also warrant formal taxonomic recognition (albeit not 
necessarily at the species level) on the grounds that they represent significant 
biotic partitions of relevance to such areas as biogeographic reconstruction and 
conservation biology. 

From consideration of these and additional factors, Avise and Ball (1990) 
suggested the following conceptual framework for biological taxonomy, based 
on genealogical concordance principles. The biological and taxonomic catego- 
ry "species" should continue to refer to groups of actually or potentially inter- 
breeding populations isolated by intrinsic RIBs from other such groups. In oth- 
er words, a retention of the basic philosophical framework of the BSC is 
warranted, in no small part because RIBs are a powerful evolutionary force in 
generating significant historical partitions in organismal phylogenies (i.e., in 
generating salient “genotypic clusters"; Mallet 1995). Within such units, “sub- 
species" warranting formal recognition could then be conceptualized as 
groups of actually or potentially interbreeding populations (normally mostly 
allopatric) that are genealogically highly distinctive from, but reproductively
compatible with, other such groups. Importantly, the empirical evidence for genealogical
distinction must come, in principle, from concordant genetic partitions
across multiple, independent, genetically based molecular (or phenotypic;
Wilson and Brown 1953) traits. This phylogenetic approach to taxonomy
near and below the species level represents a novel compromise between the 
BSC and the PSC, and is a clear conceptual outgrowth from molecular genetic 
and coalescence-based perspectives on microevolutionary processes. 


Hybridization and Introgression 


The term "hybridization" is as difficult to define as is speciation, and for simi- 
lar reasons. In the early literature of systematics, a "hybrid" was deemed to be 
an offspring resulting from a cross between species, whereas the term "inter- 
grade" was reserved for any product of a cross between recognizable conspe- 
cific populations or subspecies. But as we have seen, this distinction can be 
rather subjective, so "hybridization" is now usually employed in a broad sense 
to include crosses between genetically differentiated forms regardless of their 
current taxonomic status. "Introgression" refers to gene movement between 
species (or sometimes between well-marked genetic populations) mediated by 
hybridization and backcrossing. 


Frequencies and geographic settings of hybridization 


Hybridization and introgression are common phenomena in many plant and 
animal groups. More than 30 years ago, Knobloch (1972) compiled a list of near- 
ly 24,000 reported instances of interspecific or intergeneric plant hybridization 
(despite the availability of detailed studies on only a small fraction of the botan- 
ical world). Introgression is more challenging to assess, but Rieseberg and Wen- 
del (1993) provided a compilation of 155 noteworthy cases of plant introgres- 
sion, many of which include molecular documentation. Hybridization is 
especially common in outcrossing perennials (Ellstrand et al. 1996). Similarly, 
hybridization and introgression have been uncovered in numerous animal taxa 
(Dowling and Secor 1997; Harrison 1993). For example, Schwartz (1981) com- 
piled a list of nearly 4,000 references dealing with natural and artificial hy- 
bridization in fishes, many cases of which have been verified and characterized 
further using molecular markers (Avise 2001c; Campton 1987; Verspoor and 
Hammar 1991). Among the vertebrates, fishes with external fertilization appear 
most prone to hybridization, but the phenomenon is widespread. 

Both the frequency of hybridization and the extent of introgression can vary 
along a continuum from nil to extensive, and molecular markers are invaluable 
for empirically assessing where a given situation falls. Near one extreme, hybridization
may be confined primarily to the production of F1 hybrids, which
may be abundant or rare. For example, analyses based on nuclear and mtDNA 
markers revealed that hybrids between brook trout (Salvelinus fontinalis) and 
bull trout (S. confluentus) in Montana are mostly nearly sterile F1 individuals, but
they are also common (at some locales) and arise from crosses in both directions
with respect to sex (Leary et al. 1993). By contrast, hybrids between blue whales
(Balaenoptera musculus) and fin whales (B. physalus) are rare, but similar molecu- 
lar marker analyses proved that three phenotypically anomalous individuals 
were F1 hybrids and that one hybrid female also carried a backcross fetus with a
blue whale father (Árnason et al. 1991; Spilliaert et al. 1991). At the opposite ex- 
treme, introgressive hybridization can be so extensive that populations merge 
into one panmictic gene pool. This situation is exemplified by hybrid swarms 
between genetically distinct subspecies of bluegill sunfish (Lepomis macrochirus 
macrochirus and L. m. purpurescens) in Georgia and the Carolinas (Avise and 
Smith 1974; Avise et al. 1984b) and between cutthroat trout subspecies (On- 
corhynchus clarki lewisi and O. c. bouvieri) in Montana (Forbes and Allendorf 
1991). In both cases, these taxa normally inhabit different geographic regions, 
but can hybridize extensively when they meet. 

Reports of extensive introgressive hybridization between well-marked taxa 
occasionally appear. One case in point involves two crayfish species in northern 
Wisconsin and Michigan: the native Orconectes propinquus and an introduced 
congener, O. rusticus. Based on cytonuclear molecular analyses, female rusticus 
often mate with male propinquus, producing F1 hybrids that mediate extensive introgression
(Perry et al. 2001). One net result has been the near-elimination of genetically
pure O. propinquus in one Wisconsin lake. A similar case in vertebrate
animals involves spotted bass (Micropterus punctulatus) and smallmouth bass (M. 
dolomieui) in Lake Chatuge in northern Georgia. In the late 1970s, spotted bass
were introduced into the lake, which formerly was inhabited only by small- 
mouths. Within about 10 years, only a small percentage of genetically pure small- 
mouth bass remained, as judged by species-diagnostic nuclear and mitochondr- 
ial markers (Avise et al. 1997). Furthermore, this demographic shift had been 
accompanied by extensive introgression, such that more than 95% of the remain- 
ing smallmouth bass alleles in Lake Chatuge had become “genetically assimilat- 
ed” into the gene pool of fish with hybrid ancestry. Such an outcome, sometimes 
referred to as “genetic swamping,” can be interpreted as a local genetic extinction 
of a population via hybridization and introgression. Several additional examples 
of this phenomenon are known (see review in Rhymer and Simberloff 1996). 

In many taxonomic groups, organisms separated for long periods of evolu- 
tionary time nonetheless may retain the anatomical and physiological capacity 
for hybrid production. Using micro-complement fixation assays, Wilson and col- 
leagues (1974a, 1977; Prager and Wilson 1975) compared immunological dis- 
tances in numerous pairs of mammal species, bird species, and frog species that 
were known to be capable of generating viable hybrids in captivity or in the 
wild. The genetic distances were then translated into estimates of absolute diver- 
gence times for the species involved, using molecular clocks calibrated specifical- 
ly for each taxonomic group (Figure 7.8). Results indicated that the hybridizable 
frog species had separated from one another, on average, more than 20 million 
years ago, as had the hybridizable birds assayed, whereas mean separation time 
for the hybridizable mammal species was only about 2-3 mya. The dramatically 
faster pace at which mammals had lost the potential for interspecific hybridization
was provisionally attributed to a faster pace of chromosomal evolution or a
higher evolutionary rate in their regulatory genes (Prager and Wilson 1975; Wilson
et al. 1974a,b). Regardless of the explanation, many organisms clearly retain
the physiological capacity to hybridize over very long periods of evolutionary
time. How often such potential is realized in nature is another issue, of course,
and one that can be powerfully addressed using molecular markers.

Figure 7.8 Evolutionary separation times, as estimated from albumin immunological
distances (ID) for more than a hundred pairs of vertebrate species capable of producing
viable hybrids. Molecular clocks used to generate these times were calibrated at
about 1.7 ID units per million years in frogs and mammals (Prager and Wilson 1975)
and about 0.6 ID units per million years in birds (Prager et al. 1974). (After Prager and
Wilson 1975; Wilson et al. 1974a.)
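
The conversion from immunological distance to separation time is simple arithmetic: the observed distance is divided by the group-specific clock rate. The sketch below uses the calibrations quoted in the Figure 7.8 caption; the function name and the example distances are hypothetical.

```python
# Clock calibrations quoted for Figure 7.8, in ID units per million years.
ID_UNITS_PER_MY = {"frogs": 1.7, "mammals": 1.7, "birds": 0.6}


def separation_time_my(immunological_distance: float, group: str) -> float:
    """Estimate separation time (million years) from an albumin ID value."""
    return immunological_distance / ID_UNITS_PER_MY[group]


# Made-up distances chosen only to show how the calibrations differ.
frog_pair_my = separation_time_my(34.0, "frogs")
bird_pair_my = separation_time_my(12.0, "birds")
print(round(frog_pair_my, 1), round(bird_pair_my, 1))
```

Because the avian clock is calibrated as slower, a given immunological distance implies a much longer separation in birds than in frogs or mammals.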

Frequently, as in the basses mentioned above, hybridization follows human-mediated
transplantations (Scribner et al. 2001a). In the early 1980s, the
pupfish Cyprinodon variegatus was introduced to the Pecos River in Texas, where
it then hybridized with an endemic species, C. pecosensis. Protein electrophoret- 
ic data revealed that within 5 years, panmictic admixtures of the two pupfishes 
occupied approximately 430 river kilometers, or roughly one-half of the historic 
range of the endemic species (Echelle and Connor 1989; Echelle et al. 1987, 
1997). A similar study of land snails involved Bahamian Cerion casablancae that 
were introduced in 1915 to the range of C. incanum on Bahia Honda Key, Flori- 
da. Introgressive hybridization ensued, and analyses of allozymes and mor- 
phology later in the century revealed that the snails had become panmictic on 
Bahia Honda, that no pure C. casablancae remained, and that there had been a 
30% reduction in frequency of the introduced genome (Woodruff and Gould 
1987). Another notable example of hybridization precipitated by artificial trans- 
plantations involves salmonid fishes in the western United States. There, repeat- 
ed introductions of millions of hatchery-reared rainbow trout to endemic cut- 
throat trout habitats, and of cutthroat trout from one locale to another, were 
followed by extensive genetic introgression that has been thoroughly docu- 
mented using molecular markers (Allendorf and Leary 1988; Busack and Gall 
1981; Gyllensten et al. 1985b; Kanda et al. 2002a; Leary et al. 1984). 

With respect to geography, natural hybridization may occur sporadically 
between broadly sympatric species or be confined to particular contact areas. 
Hybrid zones are regions in which genetically distinct populations meet and 
produce progeny of mixed ancestry (Barton and Hewitt 1989; Harrison 1990), 
and they are often spatially linear (Hewitt 1989) or mosaic (Harrison and Rand 
1989; Rand and Harrison 1989). They can also move through time (Barton and 
Hewitt 1981), sometimes rapidly, as has been documented with the help of mo- 
lecular markers in plants (Martin and Cruzan 1999), butterflies (Dasmahapatra 
et al. 2002), fishes (Childs et al. 1996), birds (Rohwer et al. 2001), and mammals 
(Hafner et al. 1998), among others. Hybrid zones typically represent secondary 
overlaps of formerly allopatric or parapatric (abutting) taxa, and they are often 
evidenced by a general concordance across loci in allelic clines that transect the 
presumed contact zone (e.g., Dessauer 2000). (In theory, however, concordant 
clines could also be generated by intense diversifying selection within a contin- 
uously distributed population; Endler 1977.) Secondary hybrid zones may be 
persistent or ephemeral. Persistent hybrid zones are usually hypothesized to 
register either “bounded hybrid superiority,” wherein hybrids have superior fit- 
ness in areas of presumed ecological transition, or “dynamic equilibrium” (also 
known as genetic “tension”; Barton and Hewitt 1985; Key 1968), wherein the 
hybrid zone is maintained through a balance between continued dispersal of 
parental types into the area and hybrid inferiority (Moore and Buchanan 1985). 

Hybrid zones are marvelous settings in which to apply molecular markers 
for several reasons (Hewitt 1988). First, the populations or species involved are 
genetically differentiated (by definition), such that multiple markers for char- 
acterizing each hybrid gene pool normally can be uncovered. Second, because 
each hybrid zone involves an amalgamation of independently evolved 
genomes, exaggerated effects of intergenomic interactions can be anticipated. 
These effects magnify the impact of such processes as recombination and natural
selection, making these evolutionary forces easier to study. Third, various
sexual asymmetries frequently are involved in hybrid zones, and powerful ap- 
proaches now exist for dissecting these factors by utilizing joint data from cyto- 
plasmic and nuclear markers, as described next (see also Box 7.4). 


Sexual asymmetries in hybrid zones 


The power of molecular markers in dissecting hybridization phenomena can 
be introduced by considering a study in which cytonuclear analyses (i.e., joint 
examination of nuclear and cytoplasmic markers) were applied to a hybrid 
population between Hyla cinerea and H. gratiosa in ponds near Auburn, Alaba- 
ma. These genetically distinct treefrog species are distributed widely and sym- 
patrically throughout the southeastern United States, but judging from mor- 
phological evidence they hybridize at least sporadically, and have done so 
extensively at the Auburn site across several decades. One reason for particular
interest in the Auburn population stems from behavioral observations suggest- 
ing the potential for a sexual bias in the direction of interspecific matings. Dur- 
ing the breeding season, H. gratiosa males call from the water surface, whereas 
H. cinerea males call from perches along the shoreline (Figure 7.9A). In the 
evenings, gravid females of both species approach the ponds from surround- 
ing woods and become amplexed (mated). Thus, one hypothesis is that inter- 
specific matings might primarily involve H. cinerea males with H. gratiosa fe- 
males, rather than the reverse, because H. gratiosa females must "hop a 
gauntlet" of H. cinerea males before reaching conspecific partners. 

Lamb and Avise (1986) employed five species-diagnostic allozyme loci 
plus mtDNA to characterize 305 individuals from this hybrid population. The 
allozyme loci were chosen because they exhibited fixed allelic differences be- 
tween the species, thus allowing provisional assignment of each individual at 
the Auburn site to one of the following six categories: pure cinerea, pure gratiosa,
F1 hybrid, progeny from a backcross to cinerea, progeny from a backcross
to gratiosa, or later-generation hybrid. For example, an F1 hybrid should be
heterozygous at all marker loci, and a cinerea backcross progeny would probably
appear heterozygous at some loci and homozygous for cinerea alleles at others. 
[Probabilities of misclassifying an individual can be calculated from basic 
Mendelian considerations, and they are low whenever multiple diagnostic 
markers are used. For example, a true first-generation cinerea backcross progeny
would be mistaken for a pure cinerea with probability k = (0.5)^n, where n is
the number of fixed marker loci, so in this case, k = 0.03.] The mtDNA geno- 
types then allowed assignment of the female (and hence male) parent for each 
allozyme-characterized treefrog at the Auburn site. 
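
The misclassification probability quoted in the bracketed note can be checked with a few lines of arithmetic. The sketch below assumes unlinked, fully diagnostic loci and a true first-generation backcross; the function name is illustrative and does not come from the original study.

```python
def backcross_misclassification(n_loci: int) -> float:
    """Probability that a first-generation backcross looks like a pure parental.

    Assumes n_loci unlinked loci with fixed allelic differences between the
    species. At each locus the backcross offspring is homozygous for the
    recurrent parent's allele with probability 1/2, so all n loci appear
    parental with probability (1/2)**n.
    """
    if n_loci < 1:
        raise ValueError("need at least one diagnostic locus")
    return 0.5 ** n_loci


# Five diagnostic allozyme loci were scored in the Hyla study.
k = backcross_misclassification(5)
print(f"k = {k:.5f}")
```

With five loci the value is 0.03125, which rounds to the k = 0.03 given in the text; each additional diagnostic locus halves the error rate.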

Figure 7.9 Biological setting of a hybrid Hyla population. (A) Diagrammatic aerial
view of the Auburn pond, showing the typical spatial positions of male and female
frogs before the mating process. (B) Expected pedigree involved in production of F1
hybrids and various backcross classes, under the assumption that the hybridization
events typically entailed matings of male H. cinerea with female H. gratiosa. In both (A)
and (B), each letter refers to the species origin of the mtDNA genotype ("c", cinerea;
"g", gratiosa), and squares and circles indicate males and females, respectively.


TABLE 7.5 Genetic architecture of a hybrid population involving the treefrogs
Hyla cinerea and H. gratiosa [a]

                             gratiosa-type mtDNA      cinerea-type mtDNA
Allozyme category            Observed   Expected      Observed   Expected
Pure H. gratiosa                103         -              0         -
Pure H. cinerea                   0         -             60         -
F1 hybrid                        20        20              0         0
H. cinerea backcross             22        29             36        29
H. gratiosa backcross            52        53              1         0
Later-generation hybrids          9     Some [b]           2     Some [b]

Source: After Lamb and Avise 1986.

[a] Shown are numbers of frogs in each hybrid or non-hybrid category as identified by multi-locus
allozyme genotype, as well as the female parent species for those individuals as identified by
mtDNA markers. Also shown are expected numbers based on the behaviorally motivated
hypothesis (see text) that interspecific crosses are in the direction H. cinerea male × H. gratiosa
female, and that F1 hybrids of both sexes (who thus have H. gratiosa mtDNA) have contributed
equally to a given backcross category.

[b] Both cinerea-type and gratiosa-type mtDNA genotypes are expected among later-generation
hybrids, but relative frequencies are dependent on additional factors and, thus, are hard to predict.


For this hybrid population, the molecular data revealed a striking genetic architecture
that generally proved consistent with the suspected mating behaviors
of the parental species (Table 7.5; Figure 7.9B). Thus, all 20 F1 hybrids carried
gratiosa-type mtDNA, showing that they had gratiosa mothers. Furthermore, 52 of 53
individuals identified as backcross progeny to gratiosa possessed gratiosa-type
mtDNA (as predicted, because their mothers were either F1 hybrids or pure
gratiosa). In addition, among the progeny of backcrosses to cinerea, individuals
carrying either gratiosa-type or cinerea-type mtDNA were both well represented
(also as predicted, because the mtDNA genome transmitted in a given mating
would depend on whether the F1 hybrid parent was a male or female; see Figure
7.9B). Nevertheless, asymmetric mating alone cannot explain all aspects of the
data, because individuals with pure cinerea and pure gratiosa genotypes remained
present in high frequency (Table 7.5). Additional factors may involve
selection against hybrids or continued migration of parental species into the area.
In formal models that allowed variation in parental immigration rates and included
tendencies for positive assortative mating between conspecifics, Asmussen
et al. (1989) found an excellent fit to the empirical cytonuclear data
when, at equilibrium, about 32% of the inhabitants of the hybrid zone were pure-species
immigrants in each generation. However, the possibility of selection
against hybrids was not formally modeled.

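
The expected values in Table 7.5 follow directly from the pedigree logic of Figure 7.9B. As a minimal sketch, the function below encodes that logic under the two stated assumptions (all interspecific crosses are H. cinerea male × H. gratiosa female, and F1 males and F1 females contribute equally to backcrosses). The function and category names are hypothetical; the observed counts are those reported in the text and table.

```python
def expected_mtdna_counts(n_individuals, category):
    """Expected (gratiosa-type, cinerea-type) mtDNA counts for one hybrid class.

    Assumes every interspecific cross is cinerea male x gratiosa female and
    that F1 hybrids of both sexes contribute equally to backcrosses.
    """
    if category in ("F1", "gratiosa_backcross"):
        # The mother is either a gratiosa female or an F1 female, and both
        # carry gratiosa-type mtDNA.
        return float(n_individuals), 0.0
    if category == "cinerea_backcross":
        # F1 mother -> gratiosa mtDNA; cinerea mother (F1 father) -> cinerea mtDNA.
        return n_individuals / 2, n_individuals / 2
    raise ValueError(f"no prediction for category {category!r}")


# Observed (gratiosa-type, cinerea-type) counts from the text and Table 7.5.
observed = {
    "F1": (20, 0),
    "cinerea_backcross": (22, 36),
    "gratiosa_backcross": (52, 1),
}

for category, (n_g, n_c) in observed.items():
    exp_g, exp_c = expected_mtdna_counts(n_g + n_c, category)
    print(f"{category}: observed {n_g}/{n_c}, expected {exp_g:g}/{exp_c:g}")
```

No prediction is attempted for later-generation hybrids, because (as the table footnote states) their expected frequencies depend on additional factors.
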
How much of this pronounced genetic structure in the Hyla population 
would have been uncovered from a traditional morphological assessment 
alone? Lamb and Avise (1987) applied multivariate analyses to numerous phe- 
notypic characters in these same treefrog individuals and compared results 
against those obtained from the molecular genetic assessments. Although pure 
gratiosa and pure cinerea specimens (as classified by molecular genotype) could 
be distinguished cleanly by discriminant analyses of morphological characters,
various hybrid classes proved less recognizable. Thus, by morphology, 18% of 
true F1 hybrids were indistinguishable from pure parental species, 27% of backcrosses
in either direction were not distinguished from F1 hybrids, 50% of gratiosa
backcross progeny were misidentified as pure gratiosa, and 56% of cinerea
backcross progeny were misidentified as pure cinerea. By contrast, expected
misclassification rates based on the molecular genotypes surveyed were invariably
less than 4% (based on straightforward Mendelian considerations). Furthermore,
the pronounced asymmetry in mating behavior that apparently exerted pro- 
found influence on the genetic architecture of this hybrid population would 
have remained completely undetected by morphological assessment alone. 


More hybrid zone asymmetries 


Numerous molecular genetic analyses of hybridization and introgression have 
appeared in the past three decades, with early and recent landmark reviews or 
compilations provided by Barton and Hewitt (1985; Barton 2001; Hewitt 2001), 


BOX 7.4 Cytonuclear Disequilibria in Hybrid Zones 


Cytonuclear disequilibria (CD) are nonrandom associations within a population
between genotypes at nuclear and cytoplasmic loci (Arnold 1993; Clark 1994).
Consider a population whose individuals have been scored at a diploid autosomal gene
and at a haploid cytoplasmic locus (mtDNA in animals; cpDNA or mtDNA in
plants), and assume further that each locus has two alleles. Six different cytonuclear
genotypes are possible. It is convenient to organize the data into a three-by-two table
as follows (wherein each of the six cells in the table refers to the frequency of a
cytonuclear genotype):

                      Nuclear genotype
Cytoplasm      AA       Aa       aa       Total
M              u_1      v_1      w_1      x
m              u_2      v_2      w_2      y
Total          u        v        w        1.0





Using such tables, Asmussen et al. (1987) introduced the following four formal
measures of genotypic and allelic cytonuclear disequilibria (D):

Genotypic disequilibria

$D_1 = \mathrm{freq}(AA/M) - \mathrm{freq}(AA)\,\mathrm{freq}(M) = u_1 - ux$

$D_2 = \mathrm{freq}(Aa/M) - \mathrm{freq}(Aa)\,\mathrm{freq}(M) = v_1 - vx$

$D_3 = \mathrm{freq}(aa/M) - \mathrm{freq}(aa)\,\mathrm{freq}(M) = w_1 - wx$

(Note: $D_1 + D_2 + D_3 = 0$)

Allelic disequilibrium

$D = \mathrm{freq}(A/M) - \mathrm{freq}(A)\,\mathrm{freq}(M) = u_1 + \tfrac{1}{2}v_1 - (u + \tfrac{1}{2}v)x$

(Note: $D = D_1 + \tfrac{1}{2}D_2$)
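
The four disequilibrium measures can be computed mechanically from a table of genotype counts. The following sketch assumes counts arranged as one row per cytotype (M, then m) and one column per nuclear genotype (AA, Aa, aa). The function name and the example counts are invented purely for illustration.

```python
def cytonuclear_disequilibria(counts):
    """Return (D1, D2, D3, D) from cytonuclear genotype counts.

    counts[0] holds the (AA, Aa, aa) counts for cytotype M and counts[1]
    holds the same for cytotype m.
    """
    total = sum(counts[0]) + sum(counts[1])
    if total == 0:
        raise ValueError("empty table")
    u1, v1, w1 = (c / total for c in counts[0])
    u2, v2, w2 = (c / total for c in counts[1])
    u, v, w = u1 + u2, v1 + v2, w1 + w2  # nuclear genotype frequencies
    x = u1 + v1 + w1                      # frequency of cytotype M
    d1 = u1 - u * x
    d2 = v1 - v * x
    d3 = w1 - w * x
    d = (u1 + 0.5 * v1) - (u + 0.5 * v) * x
    return d1, d2, d3, d


# Hypothetical counts, not data from any study.
example = [(40, 15, 5), (5, 15, 20)]
d1, d2, d3, d = cytonuclear_disequilibria(example)
print(f"D1={d1:.3f} D2={d2:.3f} D3={d3:.3f} D={d:.3f}")
```

For any input, the three genotypic disequilibria sum to zero and the allelic measure equals D1 plus half of D2, which offers a quick internal check on the arithmetic.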




As shown in the following diagram (after Avise 2001c), various phenomena in hy- 
brid zones can leave characteristic CD signatures when the cytoplasmic genome is 
maternally inherited (Arnold 1993).


[Diagram not reproduced. Its columns are: Three-by-two table; Cytonuclear signature; Likely explanation.]

In these tables, plus signs indicate excesses and minus signs indicate deficits
(relative to random-mating expectations) in the observed frequencies of particular
cytonuclear genotypic classes. These various CD signatures are consistent with
(but not proof of) several possible hybrid zone phenomena described in the
right-hand column.
For example, in a well-mixed hybrid swarm (case B above), observed (obs) 
frequencies of all six cytonuclear genotypes are in statistical accord (Asmussen and 
Basten 1994) with expectations (exp) based on products of the marginal frequencies
of the single-locus genotypes, and all cytonuclear disequilibria are zero. In cases
C and D above, the CD signatures suggest that hybridization was confined to
the F1 generation, perhaps due to hybrid sterility or other mechanisms of reproductive
isolation. In case C, the interspecific matings occurred with equal likelihood
in either direction with respect to gender, whereas in case D there is evidence
for a pronounced asymmetry such that females of only one species and males of 
only the other were primarily involved. 

Cytonuclear disequilibrium theory has been extended to other kinds of popu- 
lation genetic settings as well, such as nuclear-dicytoplasmic plant systems involv- 
ing both mitochondrial and chloroplast genomes (Schnabel and Asmussen 1989), 
paternal as well as maternal cytoplasmic inheritance (Asmussen and Orive 2000), 
the estimation of gene flow via pollen versus seeds (Goodisman et al. 2000; Orive 
and Asmussen 2000), haplodiploid species and X-linked genes (Goodisman and 
Asmussen 1997), apomictic species (Overath and Asmussen 2000a), and tetraploid 
species (Overath and Asmussen 2000b). 


Arnold (1992, 1997), Harrison (1993), and Richie and Butlin (2001), among others. 
Several classic studies will be encapsulated here to illustrate the diversity of is- 
sues addressed, with special attention devoted to additional categories of genet- 
ic asymmetry that frequently attend natural hybridization and introgression. 
Some types of asymmetries reflect differential compatibilities of introgressed al- 
leles on heterologous genomic backgrounds, a phenomenon that sometimes is 
revealed by significant contrasts across unlinked loci in the steepness, width, or 
placement of clines across a hybrid zone (Barton 1983). Other asymmetries may 
reflect differences between the sexes in genetic fitness or behavior (as in the Hyla 
treefrog example above), which are often revealed by contrasting patterns in cy- 
toplasmic and nuclear markers. 


DIFFERENTIAL INTROGRESSION AND MTDNA CAPTURE ACROSS A HYBRID ZONE. 
A classic hybrid zone, which has been examined using a variety of molecular 
markers in studies spanning three decades (Fel-Clair et al. 1998; Selander et al. 
1969), involves the house mice Mus musculus and M. domesticus. These forms, 
sometimes considered subspecies, meet and hybridize along a narrow line bi- 
secting central Europe (Figure 7.10). In one early analysis, Hunt and Selander 
(1973) surveyed diagnostic allozyme markers in nearly 2,700 mice from the 
contact zone in Denmark. They discovered free interbreeding within the hy- 
brid zone, as indicated by agreement of genotype frequencies with random- 
mating expectations; an asymmetry of introgression adjacent to the zone, with 
extensive introgression of some domesticus alleles into musculus, but little gene 
movement in the opposite direction; and a marked increase in the width of the 
hybrid zone in western Denmark as compared with the east, where 90% of the 
transition in genic characters occurred across a distance of only 20 km. The dif- 
ferent slopes and spatial patterns in the allelic clines that were observed across 
loci were interpreted as evidence of different selective values for various alleles 
on foreign genetic backgrounds. In other words, "selection against introgression
of the genes studied (or chromosomal segments that they mark) is presumed
to involve reduced fitness in backcross generations caused by disruption
of coadapted parental gene complexes" (Hunt and Selander 1973).


Figure 7.10 Distributions of two species of house mice in central Europe. The
heavy line indicates the position of a hybrid zone at the contact between Mus musculus 
(to the north and east, including Scandinavia) and M. domesticus (to the south and 
west). Open and solid circles indicate mtDNA genotypes normally characteristic of M. 
domesticus and M. musculus, respectively. The arrow indicates the postulated route of 
colonization of Scandinavia by female M. domesticus. Note that the mtDNA distribu- 
tions are strikingly discordant with the ranges of the two species as defined by mor- 
phology and nuclear genes (see text). (After Gyllensten and Wilson 1987b.) 


A molecular study of the 20-km-wide hybrid zone in southern Germany re- 
vealed that about 98% of the house mice there had backcross genotypes (Sage et 
al. 1986). Furthermore, these hybrids were unusually susceptible to parasitic 
pinworms, other nematodes, and tapeworms, leading to the conclusion that the 
hybrid zone acts as a low-fitness genetic sink that interferes with gene flow be- 
tween the species. (In plant hybrid zones as well, parasites or herbivores often 
are more abundant than in non-hybrid populations, presumably because hybrid 
individuals tend to have less effective defenses; Strauss 1994.) In the Mus hybrid
zone at various European sites, subsequent studies of Y-linked and X-linked
markers, mtDNA haplotypes, and chromosomal features revealed sharp clines 
further indicative of genomic incompatibilities and severe restrictions on intro- 
gression at both nuclear and cytoplasmic loci (Boissinot and Boursot 1997; Fel- 
Clair et al. 1996; Tucker et al. 1992; Vanlerberghe et al. 1986, 1988). 

Mus musculus and M. domesticus normally differ sharply in mtDNA compo- 
sition (Ferris et al. 1983a,b), but an unexpected pattern emerged in parts of Scan- 
dinavia, where mice that by evidence from nuclear DNA and morphology ap- 
peared to be pure musculus nevertheless carried exclusively domesticus-type 
mtDNA (see Figure 7.10). Gyllensten and Wilson (1987b) proposed that the "for- 
eign” mtDNA originated from a small population of female domesticus that col- 
onized Scandinavia from a southern source (Prager et al. 1993), perhaps in asso- 
ciation with the spread of farming from northern Germany to Sweden some 
4,000 years ago. Continued backcrossing to musculus males might thereby have 
introduced mtDNA from domesticus into populations that retained a predomi- 
nant musculus nuclear genetic background. If this interpretation is correct, it pro- 
vides an example of the phenomenon of "mitochondrial capture," wherein 
mtDNA genotypes characteristic of one species sometimes occur against a pre- 
dominant nuclear background of another species because of past introgression 
between them. 
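
The logic of mitochondrial capture by repeated backcrossing can be made concrete with a short calculation. The sketch below assumes a strictly maternal mtDNA lineage founded by a single immigrant female, with every generation of daughters mating with resident males, and it ignores selection and drift. These simplifications and the function name are illustrative assumptions, not part of the cited studies.

```python
def expected_immigrant_nuclear_fraction(generations: int) -> float:
    """Expected share of the nuclear genome tracing to the immigrant female.

    Generation 0 is the immigrant female herself. Each later generation is
    the offspring of a female from the previous generation and a resident
    male, so the immigrant nuclear share halves every generation while the
    maternally inherited mtDNA remains entirely of immigrant type.
    """
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 0.5 ** generations


for g in (0, 1, 2, 5, 10):
    share = expected_immigrant_nuclear_fraction(g)
    print(f"generation {g:2d}: nuclear {share:.4f}, mtDNA 1.0000")
```

Under these assumptions the immigrant nuclear contribution drops below 0.1% within ten generations, whereas the matriline's mtDNA is unchanged, which is the pattern described for the Scandinavian mice.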

Many such cases of interspecific cytoplasmic capture (mtDNA in ani- 
mals, usually cpDNA in plants) have been reported (Harrison 1989). Repre- 
sentative examples are listed in Table 7.6. These studies (most of which provi- 
sionally eliminated lineage sorting from a polymorphic ancestor as an 
alternative explanation) document a widespread occurrence of historical 
gene exchange between species. They also highlight the considerable poten- 
tial risk of misinterpreting data from any single gene in studies of phylogeog- 
raphy and phylogenetics. In general, cytoplasmic capture is usually easier to 
document than is the capture of particular nuclear genes because whole 
linked blocks of non-recombinant markers in the mitochondrial (or chloro- 
plast) genome jointly signify the event. 


CHROMOSOMAL ALTERATIONS AS INTROGRESSION FILTERS. A recent theoreti- 
cal model focuses on structural rearrangements in chromosomes as impor- 
tant contributors to differential introgression as well as speciation (Navarro 
and Barton 2003a; Noor et al. 2001; Rieseberg 2001). The model is based on 
the well-supported notion that genetic recombination is reduced in chromo- 
somal regions heterozygous for structural alterations, which thus act as par- 
tial reproductive barriers. One net result can be the emergence between kary- 
otypically distinct populations of a semipermeable reproductive sieve, in 
which gene flow continues unabated in non-rearranged chromosomes, but is 
partially curtailed in the rearranged chromosomal segments. 

TABLE 7.6 Known or suspected examples of cytoplasmic genomic "capture"
reportedly due either to modern or ancient introgressive hybridization
between related species

Genus                 Common name           Reference

Animal mtDNA
Caledia               Grasshoppers          Marchant 1988
Clethrionomys         Voles                 Tegelström et al. 1988
Coregonus             Whitefishes           Lu et al. 2001
Drosophila            Fruit flies           Solignac and Monnerot 1986
Gila                  Chub fishes           Gerber et al. 2001
Gryllus               Crickets              Harrison et al. 1987, 1997
Hyla                  Treefrogs             Lamb and Avise 1986
Lepus                 Hares                 Thulin et al. 1997; Alves et al. 2003
Luxilus               Cyprinid fishes       Duvernell and Aspinwall 1995
Micropterus           Black bass            Avise et al. 1997
Mus                   House mice
Mytilus               Mussels
Notropis, Luxilus     Minnows
Odocoileus            Deer
Oreochromis           Cichlid fishes
Phrynosoma            Horned lizards
Rana                  Frogs
Salvelinus            Charr and trout
Stercorarius          Skuas (birds)
Tamias                Chipmunks
Thomomys              Pocket gophers
Zosterops             White-eyes (birds)

Plant cpDNA
Argyroxiphium         Silverswords
Brassica              Cabbage and allies
Dubautia              Silverswords
Eucalyptus            Australian trees
Gossypium             Cottons
Helianthus            Sunflowers
Heuchera              Heucheras
Persea                Avocados
Pinus                 Pines
Pisum                 Peas
Populus               Poplars
Quercus               Oaks
Salix                 Willows
Tellima               (Perennial herb)
Zea                   Teosintes, maize

[Reference entries for Mus through Zea are missing from the source text.]

One prediction of this model is that positively selected genetic changes
should accumulate preferentially in rearranged as opposed to non-rearranged
chromosomal regions, an expectation that found support in the following empirical
test. Humans and chimpanzees differ in major chromosomal rearrangements
involving 10 of the 22 human autosomes. While searching existing databases
for 115 protein-coding autosomal genes from these two primate species, 
Navarro and Barton (2003b) discovered that protein evolution was significant- 
ly faster in the rearranged than in the co-linear chromosomes. The authors in- 
terpreted these results as consistent with the possibility that full reproductive 
isolation between humans and chimpanzees evolved gradually with the accu- 
mulation of chromosomal rearrangements, and that during this time, genes in 
co-linear chromosomal regions may have continued to flow back and forth 
even as the reproductive barriers moved slowly toward completion. If this 
model can be generalized, it suggests that chromosomal rearrangements may 
be important factors contributing to the inter-locus variation in introgression 
patterns commonly observed in parapatric or secondary contact zones. 

Figure 7.11 Gene genealogies for nine loci surveyed in the closely related species
Drosophila pseudoobscura (solid squares), D. persimilis (open circles), D. bogotana (solid
circles), and D. miranda (open squares). Numbers on each tree indicate levels of bootstrap
support for various branches. The scale bar represents nucleotide divergence per
base pair. (After Machado and Hey 2003.)

Differential levels of hybrid-mediated genetic exchange across loci have
also been suggested through an explicit gene tree approach. Machado and
Hey (2003; see also Machado et al. 2002) used molecular sequence data to estimate
gene genealogies, separately, for each of 16 loci in several closely related
species of Drosophila. Nine of these gene trees are pictured in Figure 7.11,
and they illustrate a remarkable heterogeneity of outcomes. Genealogies of
five X-linked loci (A-E in Figure 7.11) each supported traditional thought
about the phylogeny of a portion of this complex: [(pseudoobscura, bogotana),
persimilis]. A mitochondrial gene tree (I) showed an entirely different pattern 
suggestive of recent gene flow between pseudoobscura and persimilis. Loci lo- 
cated on nuclear chromosomes 2, 3, and 4 showed other patterns. For exam- 
ple, two trees (F and H) showed no monophyly for any single species or pair 
of species, and one tree (G) showed monophyly of bogotana sequences, but 
lineage paraphyly for pseudoobscura and persimilis. These findings are impor- 
tant for several reasons. They provide one of the first available empirical as- 
sessments of multiple gene trees within and among closely related species. 
They show conclusively that composite genomes can have highly mosaic his- 
tories (and, hence, that estimating species phylogenies from only one or a few 
gene trees can have serious pitfalls). Finally, they prove that gene genealogies 
from unlinked loci can differ rather strikingly in topology, in this case proba- 
bly due to introgressive gene flow of some portions of the genome, but not 
others (although differential lineage sorting from polymorphic ancestors 
might also explain some of the gene tree heterogeneity). 

The following sections provide additional examples of how differential in- 
trogression can occur, especially via asymmetries stemming from different be- 
haviors or different fitnesses of males and females. Especially intriguing are the 
patterns of differential genetic exchange frequently observed between nuclear 
and cytoplasmic loci. Although patterns of variation for cytoplasmic and nuclear 
markers are highly concordant in some hybrid zones (e.g., Baker et al. 1989; Nel- 
son et al. 1987; Szymura et al. 1985), in others (e.g., the Hyla treefrogs discussed 
above) they may show pronounced discordances for a variety of reasons. 


HALDANE’S RULE. The heterogametic sex is the gender that carries unlike sex 
chromosomes. In humans, for example, males normally are heterogametic for 
sex chromosomes conventionally labeled X and Y, whereas in birds, females are 
heterogametic for Z and W. An empirical generality first noticed by Haldane 
(1922) is that "when in the F1 offspring of two different animal races one sex is
absent, rare, or sterile, that sex is the heterozygous [heterogametic] sex." Thus, 
in species with heterogametic males (such as mammals and fruit flies), male 
hybrids tend to show more severe reductions in viability or fertility, whereas in 
species in which females are heterogametic (such as birds and butterflies), fe- 
male hybrids more often show decreased fitness (Coyne and Orr 1989a; Orr 
1997; Presgraves 1997). In principle, such asymmetries could influence gene 
flow across hybrid zones. 

In birds and butterflies, introgression of nuclear DNA might occur via fer- 
tile male hybrids even if mtDNA introgression is blocked due to female sterili- 
ty. One possible example involves a hybrid zone between two European fly- 
catchers, Ficedula albicollis and F. hypoleuca. In accord with Haldane's rule, male 
hybrids in these birds are known to be more fertile than females (Gelter et al. 
1992; Sætre et al. 1999), and hybrids of both sexes are also known to be less fit
genetically than pure parentals. Following earlier allozyme and mtDNA analyses
by Tegelström and Gelter (1990), Sætre et al. (2001) compared microsatellite
and mtDNA markers and found that mtDNA gene flow was somewhat lower
than nuclear gene flow in the hybrid zone, albeit not significantly so. In several
other avian hybrid zones, mtDNA introgression has proved to be significantly 
diminished relative to that for at least some nuclear loci (Bensch et al. 2002; 
Brumfield et al. 2001; Helbig et al. 2001; Sattler and Braun 2000). Similarly, in a 
butterfly hybrid zone between Anartia fatima and A. amathea in Panama, re- 
duced fitness of female hybrids apparently has placed restrictions on levels 
and patterns of backcrossing (Davies et al. 1997). 
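
The expectations that follow from Haldane's rule can be summarized as a small lookup. This is only a qualitative sketch: the rule describes a tendency rather than a law, maternal inheritance of mtDNA is assumed, and the function and key names are hypothetical.

```python
HETEROGAMETIC_SEX = {"XY": "male", "ZW": "female"}


def haldane_expectation(system):
    """Qualitative expectations for hybrids under a sex-determination system.

    `system` is "XY" (e.g., mammals, fruit flies) or "ZW" (e.g., birds,
    butterflies). The result describes a tendency, not a guarantee.
    """
    if system not in HETEROGAMETIC_SEX:
        raise ValueError(f"unknown sex-determination system: {system!r}")
    less_fit = HETEROGAMETIC_SEX[system]
    return {
        "hybrid_sex_expected_less_fit": less_fit,
        # Maternally inherited mtDNA can cross a species boundary only through
        # fertile hybrid females, so it is the marker most at risk of being
        # blocked when females are the heterogametic sex.
        "mtdna_introgression_impeded": less_fit == "female",
    }


for system in ("XY", "ZW"):
    print(system, haldane_expectation(system))
```

The "ZW" case corresponds to the flycatcher and butterfly examples above, in which nuclear genes may still pass through fertile male hybrids.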

Also in accord with Haldane’s rule are voles (small mammals), in which interspecific
crosses produce fertile female hybrids but sterile males (Tegelström et
al. 1988). In populations of Clethrionomys glareolus and C. rutilus in northern Scandinavia,
Tegelström (1987b) observed a pronounced discrepancy between
species boundary and mtDNA phylogeny, leading him to conclude that rutilus- 
type mtDNA had introgressed into glareolus, perhaps following a limited hy- 
bridization episode dating to a postglacial colonization of the region some 10,000 
years ago. This historical scenario is strongly reminiscent of the suspected case of 
mtDNA capture (discussed above) in Scandinavian populations of Mus mice. 

In Drosophila hybrids, the heterogametic males often show partial or com- 
plete sterility, sometimes in one direction of a cross only. In studies spanning 
three decades, the genetic basis of hybrid sterility has been dissected using 
chromosomal and molecular markers, and the loci responsible have been 
mapped to sex chromosomes and various autosomes (Coyne and Berry 1994; 
Dobzhansky 1974; Kaluthinal and Singh 1998; Orr and Irving 2001; Vigneault 
and Zouros 1986). The homogametic female hybrids, by contrast, often remain 
fertile and thus provide what some researchers have interpreted as potential 
bridges for interspecific exchange of mtDNA via introgression (Powell 1983, 
1991). In D. mauritiana, for example, some individuals carry an mtDNA geno- 
type also found in nearby populations of D. simulans, an observation interpret- 
ed by Solignac and Monnerot (1986) to indicate recent introgression of simu- 
lans-type mtDNA into D. mauritiana. This hypothesis gained support from 
population cage studies in which the predicted takeover by D. simulans mt- 
DNA was documented experimentally over a few generations of introgressive 
hybridization (Aubert and Solignac 1990). On the other hand, DeSalle and Gid- 
dings (1986) reported that the mtDNA phylogeny for several closely related 
species of Hawaiian Drosophila matches the species phylogeny quite well, de- 
spite postulated historical introgression that sometimes has complicated phy- 
logenetic reconstructions based on nuclear genes. 


DIFFERENTIAL MATING BEHAVIORS. The Hyla example discussed earlier pro- 
vides a powerful illustration of how a behavioral asymmetry has influenced the 
genetic architecture of a hybrid zone. Another example involves a contact zone in 
France between hybridizing newts, Triturus cristatus and T. marmoratus (Arntzen 
and Wallis 1991). Allozymes again were employed to characterize the hybrid status
of individuals, and mtDNA genotypes were used to identify the female parents.
All F1 hybrids possessed cristatus-type mtDNA, perhaps due to a strong
asymmetry in mate choice. An absence of mtDNA introgression in areas where T.
cristatus replaced T. marmoratus is also consistent with this interpretation.

Sequence data from the Y chromosome have been used in conjunction
with data from mtDNA (and sometimes autosomal genes) to deduce sex-bi- 
ased patterns of dispersal and introgression in mammals. Cases in point in- 
volve hybridization between species of macaque (Macaca) monkeys, for 
which Y chromosome introgression has been reported in various settings in 
the absence of interspecific movement of mtDNA (Evans et al. 2001; Tosi et al. 
2002, 2003). Results probably reflect sex-biased dispersal in these primates, 
wherein males typically emigrate from their natal troops and engage else- 
where in matings with resident females (evidently including those of other 
species, on occasion). 

Sunfishes in the genus Lepomis are renowned for their propensity to hy- 
bridize, both in artificial ponds and in nature. As stated by Breder (1936), "There 
is probably no group of fishes, North American at least, in which there would 
seem to be a concatenation of reproductive and other events so well arranged as 
to lead to extensive hybridizing; i.e., the species are numerous; there is less geo- 
graphic separation than usual; spawning occurs at about the same temperature 
threshold; spawning sites are limited and similar for most species; nests are ex- 
changed among species." From observations of diminished hybrid fertility in 
both sexes, Hubbs and Hubbs (1933; see also Hubbs 1955) concluded that natural
hybridization probably was limited to the F1 generation. However, later
work with experimental populations revealed that "a number of different kinds
of hybrid sunfishes ... are not sterile, are fully capable of producing abundant F2
and F3 generations, and can be successfully backcrossed to parent species and
even outcrossed to non-parental species" (Childers 1967). Nonetheless, about a 
dozen recognized species within the genus normally exhibit large genetic dis- 
tances at protein-coding loci (see Figure 7.2) and, for the most part, retain dis- 
tinctive morphological identities throughout their ranges. Thus, questions re- 
mained about the magnitude of introgression between Lepomis in nature. In one 
early study, Avise and Saunders (1984) characterized a total of 277 sunfish from 
two locations in northern Georgia for species-diagnostic allozymes and mt- 
DNA. The genetic data revealed a low frequency (5%) of interspecific hybrids 
(all of which appeared to be F1 individuals), involvement of five sympatric
species in the production of these hybrids, and no evidence for introgression at 
these study locales. Furthermore, most of these hybrids were between parental 
species that differed greatly in abundance and had mothers that were from the 
less common of the hybridizing species. These data suggest a density-depend- 
ent mating pattern in which a paucity of conspecific spawning stimuli and 
mates for females of the rarer species might be key factors increasing the likeli- 
hood of interspecific hybridization. 

Another molecular analysis of natural sunfish hybridization involved 
bluegill (Lepomis macrochirus) and pumpkinseed sunfish (L. gibbosus) in a southern
Canadian lake. Among 44 phenotypically intermediate individuals examined
for allozymes and mtDNA, all proved to be F1 hybrids with pumpkinseed
mothers (Konkle and Philipp 1992). As described in Chapter 5, some bluegill 
males (but presumably not pumpkinseed males) display specialized fertilization-thievery
tactics that they normally employ with considerable success in intraspecific
spawns (Gross 1979; Gross and Charnov 1980). The gender asymmetry
of interspecific hybridization is consistent with the postulate that male
bluegills also cuckold heterospecific pumpkinseed males (albeit to no avail in
terms of the bluegills' ultimate genetic fitness). 


DIFFERENTIAL GAMETIC EXCHANGE. Especially in plants, a pronounced un- 
coupling of male and female components of gene flow across species is possi- 
ble due to the two distinct avenues for gene movement: pollen and seeds 
(Paige et al. 1991). When genetic transfer is mediated solely by pollen, mater- 
nally inherited genetic markers (e.g., cpDNA in most angiosperms) cannot in- 
trogress, whereas seed migration into a foreign population might lead to intro- 
gression of both nuclear and cytoplasmic genes when the resulting plants are 
fertilized by pollen from resident individuals. Clearly, alternative modes of 
gene transfer in plants can leave different signatures on cytonuclear associa- 
tions in hybrid zones (Asmussen and Schnabel 1991). 
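
As a rough illustration of how such signatures might be quantified, the following sketch computes a simple allelic cytonuclear disequilibrium: the covariance between carrying a nuclear allele (A) and carrying a cytoplasmic type (C). The function name, the genotype and cytotype codes, and the sample data are all hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
# Minimal sketch (assumed coding scheme): each plant is scored as a pair
# (nuclear genotype, cytotype), with genotypes "AA", "Aa", "aa" at one
# diagnostic nuclear locus and cytotypes "C" or "c" (e.g., cpDNA type).

ALLELE_DOSE = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}  # fraction of A alleles


def allelic_cytonuclear_disequilibrium(samples):
    """Return D = freq(A allele together with C) - freq(A) * freq(C)."""
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    freq_a = sum(ALLELE_DOSE[g] for g, _ in samples) / n
    freq_c = sum(1 for _, cyto in samples if cyto == "C") / n
    joint = sum(ALLELE_DOSE[g] for g, cyto in samples if cyto == "C") / n
    return joint - freq_a * freq_c


# Hypothetical data: complete association versus random association.
associated = [("AA", "C")] * 10 + [("aa", "c")] * 10
random_mix = ([("AA", "C")] * 5 + [("AA", "c")] * 5
              + [("aa", "C")] * 5 + [("aa", "c")] * 5)
d_associated = allelic_cytonuclear_disequilibrium(associated)
d_random = allelic_cytonuclear_disequilibrium(random_mix)
```

Under these assumptions, a value near zero indicates that nuclear alleles and cytotypes are associated at random, whereas larger absolute values indicate that they tend to travel together. Note that D is necessarily zero in a sample that is fixed for one cytotype, so introgression by pollen alone would appear instead as nuclear admixture without any cytoplasmic variation.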

In a series of observational and experimental studies involving allozymes, 
cpDNA, and other molecular markers, Arnold et al. (1990a,b, 1991, 1992; see re- 
views in Arnold 2000; Arnold and Bennett 1993; Johnston et al. 2001) have doc- 
umented many ecological and genetic aspects of introgression between 
Louisiana irises (genus Iris), a classic example of a plant taxonomic complex 
engaged in hybridization (Anderson 1949). For example, the authors found 
asymmetric gene flow between Iris fulva and I. hexagona at particular locales 
(Figure 7.12): Many individuals of hybrid ancestry (as evidenced by recombi- 
nant nuclear genotypes) nonetheless retained cpDNA of I. hexagona, indicating 
that they were products of pollen transfer from I. fulva onto I. hexagona flowers. 
In such cases, introgression of nDNA markers occurred in the absence of 
cpDNA transfer. Results appear to be consistent with Iris natural history, which 
includes pollen movement by mobile bumblebees and hummingbirds, but 
only short-distance seed dispersal. 

Might long-distance pollen movement lead to the exceptionally wide areas 
of introgression that have been postulated for some plant species complexes? 
In the southeastern United States, three parapatric species of buckeye trees 
(Aesculus sylvatica, A. flava, and A. pavia) appear to have experienced introgres- 
sion across a region at least 200 km wide, as inferred from patterns of morphol- 
ogy, geographic distribution, and meiotic irregularities associated with de- 
creased germination of pollen from putative interspecific hybrids. Allozyme 
data gathered for these species are also consistent with introgression scenarios, 
and they further raise the possibility that long-distance gene movement be- 
yond the hybrid zone (as recognizable by morphology) has taken place (de- 
Pamphilis and Wyatt 1990). For example, one suspected hybrid region appears 
to be highly asymmetric, with alleles characteristic of coastal-plain A. pavia also 
found in Piedmont populations, where A. sylvatica normally occurs. The au- 
thors hypothesized that hummingbirds (important pollination agents for buck- 
eyes) may have effected long-distance pollen flow during their northward 
spring migration, thus accounting for the width and asymmetry of the hybrid 
zone (dePamphilis and Wyatt 1989).

Figure 7.12 Asymmetric introgression between Iris fulva and I. hexagona, probably
resulting from pollen flow. Each circle represents a single plant. Left, relative
proportion of I. fulva (shaded) and I. hexagona (unshaded) nuclear markers. Right,
similar representation for maternally transmitted chloroplast DNA markers. Note especially the
population between the road and the bayou, in which multi-locus nuclear genotypes 
suggest the presence of advanced-generation hybrids or backcrosses, but cpDNA 
markers suggest the absence of seed dispersal from I. fulva. (After Arnold 1992.) 


Once pollen has arrived on a heterospecific style, additional challenges 
await it that could lead to asymmetric barriers to successful hybridization. Ex- 
perimental studies on Louisiana irises illustrate what can happen. In pollen 
competition experiments in which pollen grains from two or more species 
were placed on individual stigmas, genetic marker analyses of the resulting 
progeny revealed that homospecific pollen tubes typically out-competed (or 
out-paced) heterospecific pollen tubes in their race toward the ovary (Carney et 
al. 1996; Emms et al. 1996). The primary exceptions occurred only when foreign 
pollen grains were given a significant head start in these egg fertilization con- 
tests. Similar experiments on "interspecific pollen competition" in Helianthus 
sunflowers again revealed reproductive barriers between species, but no evi- 
dence for differential pollen tube growth rates in this case (Rieseberg et al. 
1995b). Another example, but with yet a different pattern, involves Eucalyptus 
trees. In interspecific crosses, E. nitens pollen tubes grow slowly and never 
reach full length in the larger E. globulus styles, whereas globulus pollen tubes 
grow rapidly in nitens styles and enter the ovary (Gore et al. 1990). Hybridiza- 
tion between several species of Eucalyptus occurs rather commonly in nature 
(Griffin et al. 1988; Jackson et al. 1999), and the unilateral cross-incompatibility
mediated by this asymmetry might play an important role in structuring
cytonuclear associations, such as those involving cpDNA, which is maternally
inherited in these trees (McKinnon et al. 2001).

Of course, a variety of other selective mechanisms operating at prezygotic 
or postzygotic stages also could lead to asymmetric introgression in Eucalyptus 
species, and in other organisms (Gore et al. 1990; Potts and Reid 1985). One 
such postzygotic mechanism is cytoplasmic male sterility (CMS), wherein hy- 
brid males tend to show greatly reduced fertility. CMS is common in arthro- 
pods, for example, where it is often attributable at least in part to the presence 
of cytoplasmically housed Wolbachia bacteria (and other intracellular microbial
taxa; Stouthamer et al. 1993) that usually are transferred to hybrid progeny 
through females, but not males (O'Neill et al. 1997; Werren 1998). Thus, this 
asymmetry could leave a genetic footprint on hybridizing populations (Gior- 
dano et al. 1997). In one genetic test of this possibility, Mandel et al. (2001) used 
molecular markers to assay for presence versus absence of Wolbachia in a hy- 
brid zone of field crickets (Gryllus). In this case, however, results indicated that 
Wolbachia were unlikely to have been involved in the unidirectional incompat- 
ibility of hybrid crosses between G. firmus and G. pennsylvanicus. 

CMS is also a widespread phenomenon in plants (Edwardson 1970), in 
which it often involves pollen abortion in hybrid progeny due to an interaction 
of cytoplasmically transmitted mutations (usually in mtDNA) with a foreign 
nuclear background (Budar et al. 2003; Hanson 1991). Male sterility in first- 
generation hybrids or backcross progeny could have the effect of attenuating 
the introgression of nuclear genes, while nonetheless leaving an open avenue 
for cytoplasmic transfer via fertile females. CMS is known to occur, for exam- 
ple, in hybrids between some Helianthus species that have figured prominently 
in discussions of inter-taxon exchange of cpDNA, as described next. 


RETICULATE EVOLUTION AND CPDNA CAPTURE. Due to a strong propensity 
for introgressive hybridization in many plant taxa, botanists have been espe- 
cially concerned with the possibility of widespread reticulate evolution (Grant 
1981; Rieseberg and Morefield 1995; Stebbins 1950), wherein the phylogeny of 
a particular taxonomic group might be anastomotic or “netlike” rather than 
strictly dichotomous and branched. Molecular analyses based on phylogenet- 
ic contrasts between nDNA and cpDNA markers have provided considerable 
support for this phenomenon (see Table 7.6). The usual evidence consists of a 
gross incongruity between the phylogeny of a particular gene (typically 
cpDNA) and the consensus species phylogeny from other sources of informa- 
tion, including nuclear markers. One powerful explanation for such incon- 
gruities is ancient cytoplasmic capture via introgressive hybridization. How- 
ever, caution should be exercised before accepting poorly documented cases 
because other evolutionary processes (such as idiosyncratic sorting of poly- 
morphic lineages that had been retained across temporally close speciation 
events; Moran and Kornfield 1993) can also generate topological discordance 
between a gene tree and a species tree (see Chapters 4 and 8; see also
Rieseberg et al. 1996b).

In Helianthus sunflowers, gross discrepancies between phylogenies estimated
from morphology, chromosomal variation, experimental crossing success, and
various molecular markers have shown conclusively that both recent and
relatively ancient episodes of inter-taxon gene exchange have
produced a reticulate pattern of relationships among these species (Rieseberg 
1991; Rieseberg et al. 1988, 1991). For example, a cpDNA-based phylogeny for 
numerous Helianthus taxa contrasts dramatically (at the indicated positions in 
the phylogeny presented in Figure 7.13) with suspected relationships based on 
morphological characters and on nuclear rDNA sequences. Thus, each of five 
species (H. anomalus, H. annuus, H. debilis, H. neglectus, and H. petiolaris) pos- 
sesses some highly distinct cpDNA genotypes otherwise characteristic of dif- 
ferent clades within the genus. Furthermore, some species appear to have cap- 
tured cytoplasms of other species on multiple occasions. For example, H. 
petiolaris acquired the cytoplasm of H. annuus at least three times, as evidenced 
by the geographic sites of introgression and by the particular cpDNA genotypes
of H. annuus that were exhibited (Rieseberg and Soltis 1991). Another
well-documented example of apparent cytoplasmic capture and reticulate
evolution involves Gossypium cottons (Wendel and Albert 1992; Wendel et al.
1991), a phylogeny for which is shown in Figure 7.14. Rieseberg (1995)
concludes that such situations are rather common in plants and that “detailed
surveys of the current literature might yield over 100 potential examples.”

Figure 7.13 Early evidence for cytoplasmic introgression in Helianthus sunflowers.
Shown is a parsimony tree based on cpDNA data; shaded branches indicate areas of
overt conflict with a morphological classification. These discrepancies are indicative of
interspecific cpDNA transfer mediated by hybridization. Numbers are levels of
bootstrap support for putative clades. Asterisks specify taxa that are stabilized hybrid
derivatives (see text). (After Rieseberg and Soltis 1991.)

In general, like animal mtDNA, the cpDNA molecule may be especially 
helpful in revealing cases of reticulation because its clonal haploid transmis- 
sion allows particular ancestral sources to be identified without the complica- 
tion of recombination (including recombination at the intragenic scale), 
which can lead to a mosaic ancestry for the nuclear genome. Apart from this 
detection bias, cytoplasmic DNA might tend to introgress more readily than 
nuclear DNA if genes contributing to reproductive isolation are housed pri- 
marily in the nucleus (Barton and Jones 1983). Indeed, in a detailed genetic 
dissection of three natural hybrid zones between the sunflowers Helianthus 
petiolaris and H. annuus, based on 88 marker loci distributed across 17 chro- 
mosomes, Rieseberg et al. (1996a, 1999) found introgression to be significant- 
ly reduced relative to neutral expectations for 26 chromosomal segments, 
suggesting that each of these nuclear segments contained factors contributing 
to RIBs. On the other hand, an opposite argument can be made: Because cyto- 
plasmic DNA is haploid and is therefore potentially exposed to selection in 
all individuals, and because it houses non-recombining genes, several of 
whose protein products must functionally interact with those of nuclear 
genes, cytoplasmic introgression might be severely impeded relative to that 
of typical nuclear DNA. If so, then the reticulation events revealed most 
clearly by cytoplasmic captures might be only the tip of the iceberg of inter- 
lineage gene exchange (see Chapter 8). 

In summarizing this section, it is abundantly clear that hybridization phe- 
nomena are best analyzed through multiple lines of evidence involving several 
types of molecular (and other) markers. Barriers to reproduction between close- 
ly related taxa are seldom absolute, and they often appear differentially “semi- 
permeable” to cytoplasmic and various nuclear alleles. Thus, a rich and varied 
fabric of gene genealogies (seldom evident from traditional morphological as- 
sessment alone) characterizes many hybrid zones, revealing varying degrees of 
reticulation among the phylogenetic branches connecting related species. 


More hybrid zone phenomena 


CONSISTENCY OF OUTCOMES? Hybrid zone contacts tend to be rather idiosyn- 
cratic evolutionary happenings, so it is seldom possible to examine perfect 
replicates in nature and thereby critically assess any repeatability in outcomes. 
Even when multiple transects along a linear or mosaic hybrid zone are exam- 
ined, uncontrolled key variables, such as genetic background or ecological set- 
ting, may differ. An alternative is to replicate hybrid zones experimentally, ini- 
tiating them with characterized genetic stocks under controlled ecological 
conditions and then monitoring the temporal course of hybridization and
introgression (Emms and Arnold 1997; Hodges et al. 1996).

Figure 7.14 Heuristic representation of a cpDNA-based phylogeny for 34 diploid
species of Gossypium. Dashed lines crossing solid lines indicate probable instances of
cytoplasmic capture and, hence, reticulate evolution. (After Wendel and Albert 1992.)

An example of this approach involved two species of mosquitofish (Gambusia
affinis and G. holbrooki) that, as judged by allozyme and mtDNA markers,
hybridize naturally across a broad portion of the southeastern United States 
(Scribner and Avise 1993a). Scribner and Avise (1994a) examined replicated 
sets of experimental pond and pool populations, each initiated with specified 
equal numbers of adult male and female affinis and holbrooki. These captive 
populations were then monitored periodically for changes in cytonuclear ge- 
netic composition over a 2-year period. The dynamics of hybridization and in- 
trogression proved to be remarkably consistent among replicates. In each case, 
there was an initial flush of hybridization, followed by backcrossing and a 
rapid decline in the frequencies of affinis nuclear and cytoplasmic alleles. Sim- 
ilar outcomes across 2 years were also observed in experimental Gambusia 
populations set up in more complex habitats inside the Biosphere 2 facility in 
Arizona (Scribner and Avise 1994b). Overall, the genetic analyses documented 
extensive introgressive hybridization accompanied by strong directional se- 
lection that promoted rapid and consistent evolutionary changes favoring hol- 
brooki alleles (Figure 7.15). The results were also relevant to behavioral, demo- 
graphic, and life history aspects of natural hybridization between these small 
fishes (Scribner and Avise 1993b). 
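
The sorting of specimens into non-hybrid, F1, backcross, and later-generation categories from species-diagnostic markers can be sketched as follows. This is a simplified illustration rather than the procedure of any study cited here: the function name and the genotype codes ("AA", "AB", "BB" for homozygous species A, heterozygous, and homozygous species B) are hypothetical, and the rules assume that every locus scored is fully diagnostic.

```python
# Minimal sketch (assumed coding): one genotype code per diagnostic nuclear
# locus, where "A" and "B" denote alleles fixed in the two parental species.

VALID_CODES = {"AA", "AB", "BB"}


def classify_specimen(nuclear_genotypes):
    """Assign a provisional hybrid class from diagnostic nuclear loci."""
    if not nuclear_genotypes:
        raise ValueError("at least one diagnostic locus is required")
    kinds = set(nuclear_genotypes)
    if not kinds <= VALID_CODES:
        raise ValueError("unexpected genotype code")
    if kinds == {"AA"}:
        return "pure A"
    if kinds == {"BB"}:
        return "pure B"
    if kinds == {"AB"}:
        return "probable F1"  # heterozygous at every diagnostic locus
    if kinds == {"AA", "AB"}:
        return "backcross to A"
    if kinds == {"BB", "AB"}:
        return "backcross to B"
    # Homozygous for A alleles at some loci and for B alleles at others:
    # not possible in an F1 or a first-generation backcross.
    return "later-generation hybrid"


example = classify_specimen(["AB", "AA", "AB", "AA"])
```

Assignments of this kind are only provisional. With few loci, a backcross individual can by chance appear pure or F1-like, and some later-generation hybrids will fall into the backcross categories, which is why such classes are usually qualified as "probable." A maternally inherited marker such as mtDNA can then be scored alongside each class to indicate the direction of the crosses.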


HYBRID FITNESS. One traditional notion is that interspecific hybridization is 
costly to the participants, typically yielding progeny with diminished fitness 
and resulting in hybrid zones that act as genetic sinks (Barton 1980). In recent 
years, there has been a resurgence of interest (Arnold 1997; Dowling and Secor 
1997; Rieseberg 1995) in an old suggestion (Anderson and Stebbins 1954; 
Lewontin and Birch 1966) that, by virtue of their possessing novel recombinant 
genotypes, some hybrid populations might also be sources of adaptive evolu- 
tion and lineage diversification. 

One prediction of the latter view is that the fitnesses of hybrid organisms 
should sometimes surpass those of their parents. Indeed, studies employing 
molecular markers (often in conjunction with other information used to identi- 
fy hybrid classes and to estimate various components of reproductive fitness) 
have shown that particular recombinant genotypes in at least some ecological 
settings do outperform the non-hybrid genotypes of parentals. Examples have 
been documented in a wide variety of animal and plant taxa ranging from 
Sceloporus lizards (Reed and Sites 1995; Reed et al. 1995a,b) to Mercenaria clams 
(Bert and Arnold 1995) to Artemisia sagebrushes (Freeman et al. 1995, 1999;
Graham et al. 1995; Wang et al. 1997) and Iris plants (Arnold et al. 1999; Burke 
et al. 1998). In a review of 37 research studies on this topic, Arnold and Hodges 
(1995) reported that hybrids possessed the highest fitness values in 5 cases 
(13%), were equivalent to the most fit parental class in 15 cases (40%), were in- 
termediate to the two parents in 7 cases (20%), and were the least fit in 10 cases 
(27%). Even if they represent only a minority of outcomes (Burke and Arnold 
2001), such instances of increased fitness of recombinant genotypes in hybrid 
zones are certainly intriguing.

Figure 7.15 Temporal genetic changes in four sets of experimental Gambusia hybrid
populations monitored over 2 years. Shown are observed frequencies of non-hybrid
specimens (pure holbrooki or pure affinis), probable F1 hybrids, backcrosses in either
direction, and later-generation hybrids, as gauged by species-diagnostic mtDNA and
nuclear (allozyme) markers. Numbers above histogram bars are observed percentages of
specimens with holbrooki-type mtDNA. Pools and ponds were located at the Savannah 
River Ecology Lab (SREL). (After Avise 2001c.) 


There are two basic ways in which natural hybridization might contribute 
significantly to longer-term evolution: by transferring adaptations from one 
taxon to another and by promoting the foundation of new evolutionary line- 
ages (i.e., new species). The first type of creative potential is probably evi- 
denced by at least some of the many remarkable instances of interspecific 
gene capture (discussed above) mediated by introgressive hybridization. Such 
cases provide prima facie evidence that transferred genes (which often in- 
volve whole cytoplasmic genomes) clearly can function at least adequately in a 
heterologous genetic background, despite what must have been earlier long- 
term independent evolution in separate species before the introgression 
events. The second type of creative potential for hybridization, generating
new species, seems even more remarkable, but this phenomenon is also well
documented, as described next. 


Speciation by hybridization 


Differentiated genomes brought together through hybridization can sometimes 
produce a new species. One such mechanism, allopolyploidization, already has 
been described (see Box 7.3). Surveys of molecular markers have revealed oth- 
er hybridization-mediated routes to speciation as well (Abbott 1992). 


DIPLOID OR HOMOPLOID SPECIATION. In the plant literature, a traditional pro- 
posal has been that species isolated from each other by a chromosomal sterility 
barrier might give rise via hybridization to new fertile diploid species that are 
at least partially reproductively isolated from both parents (Grant 1963; Steb- 
bins 1950). Such "recombinational speciation" (Grant 1981), without change in 
chromosome number, was verified by the experimental synthesis of new hy- 
brid species under artificial conditions (see review in Rieseberg et al. 1990a). 
However, questions remained about the prevalence of this speciation mode in 
nature. In various plant taxa, numerous candidates for hybrid species were 
identified in early studies of morphology, ecology, and geographic ranges, but 
full confirmation of hybrid ancestry for particular species awaited the applica- 
tion of molecular markers (Rieseberg 1997). 

A diploid annual plant native to southern California, Stephanomeria diegen- 
sis, was suspected to have originated by stabilization of recombinant genotypes 
from a natural cross between two divergent diploid relatives (S. exigua and S.
virgata) with the same chromosome number. Gallez and Gottlieb (1982)
demonstrated that S. diegensis indeed displays an additive profile of allozyme alleles
characteristic of its presumed relatives, a finding consistent with the plant's
intermediate morphology and karyotype and supportive of its postulated hybrid
origin. Similarly, findings based on more than a hundred nuclear RAPD markers
are consistent with a hybrid origin of the diploid shrub Encelia virginensis
from two other diploid congeners, E. actoni and E. frutescens (Allan et al. 1997). 
Another plant species of putative hybrid origin, Iris nelsoni, proved upon mo- 
lecular analysis to possess a combination of nuclear genes characteristic of 
three species—I. fulva, I. hexagona, and I. brevicaudis—that all appear to have 
been involved in its formation (Arnold et al. 1990b, 1991). However, not all mo- 
lecular genetic reappraisals have confirmed the suspected hybrid origins of 
problematic plant taxa. The diploid annual Lasthenia burkei proved not to pos- 
sess a combination of allozyme alleles present in L. conjugens and L. fremontii, 
thus disputing earlier hypotheses that L. burkei is a stabilized hybrid derivative 
of those two species (Crawford and Ornduff 1989). 
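
The logic of an "additive profile" test can be sketched as follows, under a working definition assumed here for illustration: a candidate hybrid taxon carries no alleles absent from both putative parents, and it carries alleles specific to each of them. The function names, locus names, and allele labels are hypothetical and do not represent data from the studies cited above.

```python
# Minimal sketch: each taxon is a dict mapping locus name -> set of alleles
# observed at that locus (hypothetical allozyme data).


def additivity_report(candidate, parent_1, parent_2):
    """Per locus, list novel alleles and alleles specific to each parent."""
    report = {}
    for locus, alleles in candidate.items():
        p1 = parent_1.get(locus, set())
        p2 = parent_2.get(locus, set())
        report[locus] = {
            "novel": alleles - (p1 | p2),
            "parent_1_specific": alleles & (p1 - p2),
            "parent_2_specific": alleles & (p2 - p1),
        }
    return report


def looks_additive(report):
    """True if no novel alleles occur and both parents are represented."""
    no_novel = all(not r["novel"] for r in report.values())
    has_p1 = any(r["parent_1_specific"] for r in report.values())
    has_p2 = any(r["parent_2_specific"] for r in report.values())
    return no_novel and has_p1 and has_p2


# Hypothetical example with two diagnostic loci.
parent_1 = {"Pgi": {"a"}, "Adh": {"f"}}
parent_2 = {"Pgi": {"b"}, "Adh": {"s"}}
candidate = {"Pgi": {"a", "b"}, "Adh": {"f", "s"}}
result = looks_additive(additivity_report(candidate, parent_1, parent_2))
```

A profile that passes such a check is consistent with hybrid origin but does not demonstrate it, and a profile that fails (for example, because the candidate carries alleles found in neither putative parent) argues against the proposed parentage.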

Rieseberg et al. (1990a) noted that, by hard criteria, apparent additivity of 
alleles in the nuclear genome is insufficient to confirm the hybrid origin of a 
diploid species because it does not exclude the possibility that the taxon in 
question is ancestral to its putative parents. One solution to this problem is to 
use additional markers (such as those from cpDNA) to establish the evolu- 
tionary polarity of relationships. Applying this philosophy to allozymes and 
cpDNA markers in diploid sunflowers, Rieseberg et al. (1990a, 1995c) con- 
cluded that three problematic taxa, appropriately named Helianthus paradoxus, 
H. anomalus, and H. neglectus, had dissimilar pathways of origin. The last 
species in this list appears to be a recent non-hybrid derivative of H. petiolaris, 
but the first two species did prove to be hybrids having arisen from crosses 
between H. annuus and H. petiolaris. The analysis of H. anomalus also illustrates 
the exceptional refinement of some of these genomic dissections of hybrid 
speciation. Rieseberg et al. (1995c) employed approximately 200 mapped 
RAPD markers to identify the precise genomic linkage blocks in H. anomalus 
that had stemmed (perhaps on multiple independent occasions; Schwarzbach 
and Rieseberg 2002) from the two parental species that produced this hybrid 
taxon (Figure 7.16). 

Rieseberg et al. (1996c) also experimentally synthesized three independent 
hybrid lineages (by crossing H. annuus with H. petiolaris), then compared the 
genomic compositions of these synthetic hybrids with that of natural H. anom- 
alus. Their most important discovery was that patterns of introgression of the 
two parental genomes were significantly correlated across all of these hybrids. 
In other words, particular blocks of introgressed loci were nonrandomly simi- 
lar in all cases, thus strongly indicating that natural selection, rather than 
chance, often governed the detailed genomic composition of the hybrid 
species. Further genomic and phenotypic comparisons of ancient and synthet- 
ic hybrids between Helianthus species have also shown how complementary 
gene action following hybridization may produce extreme phenotypes that can 
facilitate adaptive evolution and ecological divergence (Rieseberg et al. 2003).

Figure 7.16 Genetic linkage maps for nearly 200 RAPD markers in an ancient hybrid
species, Helianthus anomalus, and in the two parental species from which it was derived. 
Linkage groups are arranged into three structural sets: (A) complete co-linearity, (B) inver- 
sions, and (C) inter-chromosomal translocations. For H. anomalus, letters in parentheses af- 
ter each marker indicate parental origin (p, petiolaris; a, annuus; a/p, either annuus or petio- 
laris; u, unique to anomalus). Lines between linkage groups connect loci that are known to 
be homologous between the hybrid and its parental species. Underlines denote loci that 
are homologous between H. annuus and H. petiolaris. (After Rieseberg et al. 1995c.)

Less attention has been devoted to the possibility of homoploid speciation
via hybridization in animals. On the basis of morphological and allozyme data, 
Highton et al. (1989) suggested that the salamander Plethodon teyahalee had a hy- 
brid origin from past interbreeding between P. glutinosus and P. jordani. Several 
other such cases have been postulated (but seldom well documented) in
mollusks, crustaceans, insects, and various vertebrates (see review in Dowling and
Secor 1997). Among fishes in particular, several examples of stabilized hybrid 
forms have been suspected from morphological or distributional considerations. 
For example, the “Zuni sucker” in the Little Colorado drainage of the western 
United States was suggested to be a hybridization-derived intermediate between
Catostomus discobolus of the Colorado drainage and C. plebeius of the Rio Grande, 
and the “white shiner” in the Roanoke and adjacent drainages of the eastern 
United States was proposed to have arisen from hybridization between nearby 
Luxilus cornutus and L. cerasinus. However, in the case of the Zuni sucker, molec- 
ular reevaluations did not support the hybrid origin scenario (Crabtree and Buth 
1987), and in the case of the white shiner, molecular data were equivocal due to 
the difficulty of distinguishing among the possibilities of ancestral polymor- 
phism, convergence, and past hybridization, all of which could account for the 
observed allozyme distributions (Meagher and Dowling 1991). 

Molecular reappraisals have provided support for postulated hybrid origins 
of a few recognized fish species. By virtue of its mosaic genetic structure, as reg- 
istered in mitochondrial and nuclear gene assays, one cichlid species (Neolam- 
prologus marunguensis) in Africa's Lake Tanganyika appears to have arisen via 
introgressive hybridization between two ancient and genetically distinct species 
(Salzburger et al. 2002). In the case of the cyprinid Gila seminuda of the Virgin 
River in the western United States, an intermediate morphology led to suspicion 
that this species also was a hybrid, derived in this case from crosses between the 
roundtail chub (G. robusta) and bonytail chub (G. elegans). G. seminuda proved to 
be polymorphic for allozyme alleles at two loci otherwise diagnostic for the pu- 
tative parental taxa; as judged by mtDNA, the matriarchal lineage retained by 
G. seminuda derived from G. elegans (DeMarais et al. 1992). Phylogenetic analy- 
ses of additional Gila species (Dowling and DeMarais 1993) identified phyloge- 
netic conflicts between nuclear and mtDNA markers, a finding further sugges- 
tive of past episodic introgression. As noted by the authors, such stabilized 
hybrid derivatives might be relatively common in some groups of fishes, but re- 
main unrecognized due to a lack of detailed molecular studies. As also noted by 
the authors, the formal taxonomic status of such introgressed forms will proba- 
bly remain a point of contention, especially when the population in question is 
currently isolated from its parental species by extrinsic (geographic) barriers to 
reproduction. Should the hybrid form be considered a distinct population, a 
subspecies, or a species? This question is not merely academic, but is relevant to 
the implementation of conservation programs (see Chapter 9). 


ORIGINS OF UNISEXUAL BIOTYPES. Unisexual “biotypes” that reproduce by 
parthenogenesis, gynogenesis, or hybridogenesis (see Figure 5.4) are not
biological species in the usual sense applied to sexually reproducing taxa, but they
are nonetheless isolated genetically from their sexual relatives (as well as from
other unisexual lineages) and are typically afforded formal taxonomic recogni- 
tion. Parthenogenetic forms are found in a variety of taxonomic groups, and 
many of them have been examined for evolutionary origins using molecular 
markers. For example, using mtDNA sequence data, Johnson and Bragg (1999) 
identified the female of the sexual species involved in hybrid-mediated gene- 
ses of asexual Campeloma snails in the southeastern United States, and they also 
showed that the parthenogenetic taxa were of polyphyletic origin. Similarly, 
Delmotte et al. (2003) used nuclear and cytoplasmic markers to show that 
many parthenogenetic aphid strains arose through multiple hybridization 
events between Rhopalosiphum padi and an unknown sibling species. Among 
the vertebrates, essentially all of the 70 known unisexual taxa also arose 
through hybridization events between related sexual species, and molecular 
markers have likewise been highly informative in revealing the mode of origin 
and parentage of these all-female biotypes. 

In most cases, the particular bisexual progenitors of various unisexual ver- 
tebrate taxa were suspected from earlier comparisons of morphology, kary- 
otype, or geographic range, but molecular surveys of allozymes and mtDNA 
have confirmed those hybrid origins in several instances and, for the first time,
revealed the sexual directions of the original crosses. For example, a gyno- 
genetic livebearing fish, Poecilia formosa, in northeastern Mexico exhibits nearly 
fixed heterozygosity at numerous protein and allozyme loci that distinguish or 
are polymorphic in the sexual species P. latipinna and P. mexicana (Abramoff et 
al. 1968; Balsano et al. 1972; Turner 1982). These molecular findings confirmed 
earlier evidence from morphology and geography that P. formosa arose via hy- 
bridization between these sexual species. The gynogen P. formosa also carries 
the mtDNA of P. mexicana, which is highly divergent from that of P. latipinna, 
indicating that the direction of the initial hybrid cross(es) was P. mexicana fe- 
male x P. latipinna male (Avise et al. 1991). Similar molecular inspections have 
allowed unambiguous determination of the sexual progenitors for more than 
25 unisexual biotypes (examples are given in Table 7.7). 

A few unisexuals carry genomic contributions from more than two sexual 
ancestors. The gynogenetic fish Poeciliopsis monacha-lucida-viriosa, for example, 
includes P. viriosa nuclear genes apparently introgressed as a result of occasional
matings of P. monacha-lucida females with P. viriosa males, rather than with
males of their usual sexual host, P. lucida (Vrijenhoek and Schultz 1974). As 
judged by allozymes and other evidence, several triploid parthenogenetic 
lizards in the genus Cnemidophorus also carry genes from three sexual progeni- 
tors (Dessauer and Cole 1989; Good and Wright 1984), probably as a result of 
multiple hybridization events involving these species. 

In many cases, molecular analyses have further pinpointed the geographic 
and genetic sources of particular unisexual biotypes. For example, based on 
mtDNA comparisons, the matrilineal components of nine unisexual biotypes 
in the sexlineatus group of Cnemidophorus lizards all appear to stem from fe- 
males within one of the four nominate geographic subspecies of C. inornatus: C. 
i. arizonae (Densmore et al. 1989a). Similar molecular inspections have traced
the maternal ancestry of five triploid unisexual strains in the Poeciliopsis
monacha-lucida complex of fishes to a bisexual species, P. monacha, from the Río 
Fuerte in northwestern Mexico (Quattro et al. 1992b). 

Typically, the bisexual relatives of unisexual species have proved highly dis- 
tinct in their mtDNA genotypes, whereas the mtDNAs of the unisexuals are 
closely related to or indistinguishable from those of one of the sexual progeni- 
tors. Thus, an emerging generalization is that most extant unisexual biotypes 
originated through asymmetric hybridization events occurring in one direction 
only (e.g., A female x B male versus B female x A male). Whether this phenome- 
non reflects some asymmetric mechanistic constraint on the origin of unisexuals 
or merely the survival of a limited subset of lineages from crosses in both direc- 
tions generally remains unclear. However, in the case of the Poeciliopsis hybrido- 
gens, laboratory crosses of P. monacha females x P. lucida males sometimes result 
in the spontaneous production of viable hybridogenetic lineages, whereas the 
reciprocal matings do not (Schultz 1973). This direction of crossing is consistent 
with the molecularly inferred origins of natural hybridogenetic strains, all of 
which possess monacha-type mtDNA (Quattro et al. 1991). Furthermore, these 
extant natural hybridogens have arisen multiple times through separate hy- 
bridization events, as gauged by their links to several different branches in the 
mtDNA phylogeny of their maternal ancestor P. monacha (Figure 7.17). 

One exception to such straightforward hybrid origins involves the hybri- 
dogenetic frog Rana esculenta of Europe, in which individuals exhibit mtDNA 
genotypes normally characteristic of either R. lessonae or R. ridibunda (Spolsky 








Unisexual biotype                 Ploidy level    Reproductive mode (a)
Cnemidophorus (lizards)
    uniparens                     3n              P
    tesselatus                    2n              P
    velox                         3n              P
    laredoensis                   2n              P
Heteronotia binoei (lizard)       3n              P
Menidia clarkhubbsi (fish)        2n              G
Phoxinus eos-neogaeus (fish)      2n, 3n          G
Poecilia formosa (fish)           2n              G
Poeciliopsis (fishes)
    monacha-lucida                2n              H
    monacha-occidentalis          2n              H

Note: See Avise et al. 1992c for an extended list.

(a) P = parthenogenetic; G = gynogenetic; H = hybridogenetic (see Figure 5.4).

(b) The bisexual parental species were identified in each case from allozymes, morphology,
karyotype, geographic range, or other information. The female parent species was identified
by mtDNA comparisons.
Figure 7.17 Relationships among mtDNA haplotypes in the sexual fish Poeciliopsis
monacha (open circles) and its unisexual derivative, P. monacha-lucida (shaded circles). 
Slashes are inferred mutations along the parsimony network; dashed lines indicate al- 
ternative network pathways. (After Quattro et al. 1991.) 
and Uzzell 1986). The hybridogen R. esculenta is unique among assayed "asexual"
biotypes in consisting of high frequencies of both males and females.
From behavioral considerations, the initial hybridizations producing R. esculenta
were postulated to involve male R. lessonae x female R. ridibunda. Once the
hybridogen was formed, occasional matings of male R. esculenta with female R. 
lessonae may have introduced lessonae-type mtDNA secondarily into R. esculen- 
ta. Furthermore, females belonging to such R. esculenta lineages appear to have 
served as a natural bridge for interspecific transfer of lessonae mtDNA into cer- 
tain R. ridibunda populations via matings with R. ridibunda males (Spolsky and 
Uzzell 1984). Such crosses apparently produced "R. ridibunda" frogs with normal
nuclear genomes (because R. lessonae chromosomes are excluded during
meiosis), but with lessonae-type mtDNA. 

Another complex scenario surrounds the hypothesized maternal ancestry 
of the triploid salamander Ambystoma 2-laterale-jeffersonianum, which, accord- 
ing to allozyme evidence, contains nuclear genomes of the bisexual species A. 
laterale and A. jeffersonianum, but reportedly carries mtDNA from A. texanum 
(Kraus and Miyamoto 1990). The authors favor an explanation in which an 
original A. laterale-texanum hybrid female produced an ovum with primarily A. 
laterale nuclear chromosomes, but the female-determining sex chromosome 
(W) and the mtDNA of A. texanum. If such a female were fertilized by a male A.
laterale, female progeny with two A. laterale nuclear genomes and the mtDNA 
of A. texanum would result. Subsequent hybridization with male A. jeffersonianum
could then produce the observed A. 2-laterale-jeffersonianum biotypes carrying
texanum-type mtDNA. Although this scenario is speculative, its mere feasibility
suggests that distinct reticulate histories could characterize different
genomic elements in some hybridogenetic taxa. 

Another question about unisexual vertebrates addressed by molecular 
markers concerns mechanistic modes of polyploid formation. More than 60% 
of known unisexual biotypes are polyploid (Vrijenhoek et al. 1989), and two 
competing hypotheses have been advanced to account for the evolutionary origins
of these forms (Figure 7.18). Under the "primary hybrid origin" hypothesis,
a disruption of meiotic processes in an F1 interspecific hybrid leads to the
production of unreduced diploid eggs whose subsequent fertilization by 
sperm leads to a triploid condition (Schultz 1969). Alternatively, under the 
"spontaneous origin" hypothesis, parthenogenetic triploids might have arisen 
when unreduced oocytes from a diploid non-hybrid were fertilized by sperm 
from a second bisexual species (Cuellar 1974, 1977). As diagrammed in Figure 
7.18, joint comparisons of mitochondrial and nuclear markers permit an empirical
test of these competing possibilities. If a unisexual biotype arose spontaneously
from bisexual ancestors and hybridization was involved only secondarily,
the paired homospecific nuclear genomes should derive from the
maternal parent and, thus, should be coupled with mtDNA derived from the 
same species. Conversely, under a model of primary hybrid origin, the paired 
homospecific nuclear genomes could be coupled with the mtDNA type from 
either of the sexual ancestors, depending on whether the nuclear genome was 
duplicated or added (see below). 
Figure 7.18 Competing scenarios for the origin of triploid unisexual taxa. Each uppercase
letter represents one nuclear gene set (A or B) from the respective parental
species, and lowercase letters in boxes similarly refer to maternally transmitted mtDNA
genomes. Smaller ovals represent sperm (tailed) and eggs (non-tailed); asterisks
indicate eggs that are unreduced (i.e., diploid). In the genome duplication scenario,
suppression of reduction occurs during an equational division such that the AB hybrid
produces AA (or BB) ova. (After Avise et al. 1992c.)


Cytonuclear genetic analyses for several unisexual taxa have provided support
for the primary hybrid origin hypothesis. For example, the triploid
parthenogen Cnemidophorus flagellicaudus possesses the mtDNA of C. inornatus, 
but two homospecific nuclear genomes from C. burti, and a similar cytonuclear 
pattern was observed for eight of ten parthenogenetic Cnemidophorus biotypes 
examined (Densmore et al. 1989a; Moritz et al. 1989). Similarly, the triploid gyno- 
genetic fish Poeciliopsis monacha-2 lucida possesses the mtDNA of P. monacha, but 
two nuclear genomes from P. lucida (Quattro et al. 1992b). These results appear to 
refute the “spontaneous origin” scenario (unless the diploid non-hybrid that pro- 
duced unreduced gametes was a male, in which case all bets are off). 
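
The logic of this cytonuclear test can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and its return labels are assumptions made here, not part of any published method, and the rule ignores the caveat about a male producing the unreduced gametes. The species pairings used in the example are those reported in the text.

```python
def consistent_origins(paired_nuclear_species, mtdna_species):
    """List the origin hypotheses not excluded by a triploid's cytonuclear pattern.

    paired_nuclear_species: species contributing the two homospecific nuclear genomes
    mtdna_species: species whose mtDNA type the unisexual biotype carries
    """
    # Primary hybrid origin allows the paired genomes to be coupled with
    # mtDNA from either sexual ancestor, so it is never excluded by this test.
    hypotheses = ["primary hybrid origin"]
    # Spontaneous origin requires the paired genomes and the mtDNA to come
    # from the same (maternal) species.
    if paired_nuclear_species == mtdna_species:
        hypotheses.append("spontaneous origin")
    return hypotheses


# Patterns reported in the text
flagellicaudus = consistent_origins("C. burti", "C. inornatus")
monacha_2_lucida = consistent_origins("P. lucida", "P. monacha")
print(flagellicaudus)      # only the primary hybrid origin remains
print(monacha_2_lucida)
```

Note that a match between the paired nuclear genomes and the mtDNA would leave both hypotheses standing, which is why mismatches are the informative outcome.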

Assuming correctness of the primary hybrid origin scenario, two further 
cytogenetic pathways to triploidy can be distinguished (see Figure 7.18). Under 
the “genomic addition” scenario (Schultz 1969), interspecific F1 hybrids produce
unreduced ova (AB) that then unite with a haploid gamete from one of
the sexual ancestors to produce allotriploid backcross biotypes AAB or ABB.
Under the “genomic duplication” scenario (Cimino 1972), suppression of an
equational cellular division in an F1 hybrid could produce unreduced AA or BB
ova, which following a backcross to species A or B would produce AAB or ABB
offspring (autopolyploid AAA or BBB progeny could also result from this 
process, but no self-sustaining populations of autopolyploid unisexual verte- 
brates are known). An important distinction between these pathways is the 
predicted level of heterozygosity in the homospecific nuclear genomes. Het- 
erozygosity should be extremely low under the genome duplication pathway 
(the only variation being derived from post-formational mutations), whereas 
normal heterozygosity is predicted under the genomic addition pathway. At 
least one test of these scenarios has been done: In triploid Poeciliopsis gynogens, 
all assayed strains proved to be heterozygous for homospecific nuclear mark- 
ers at one or more allozyme loci, a result that probably excludes the genome 
duplication hypothesis for these fishes (Quattro et al. 1992b). 
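
The heterozygosity criterion can likewise be written as a simple rule. The sketch below is hypothetical throughout: the function name, locus names, and allele labels are invented for illustration, and a real analysis would also have to weigh the possibility that a heterozygous locus reflects a post-formational mutation.

```python
def favored_pathway(homospecific_genotypes):
    """Suggest which cytogenetic pathway the homospecific genomes favor.

    homospecific_genotypes: dict mapping locus name -> (allele, allele) for the
    two nuclear genomes derived from the same parental species.
    """
    heterozygous = [locus for locus, (a, b) in homospecific_genotypes.items()
                    if a != b]
    if heterozygous:
        # Genome duplication predicts near-zero heterozygosity, so any
        # heterozygous locus points toward genomic addition.
        return "genomic addition", heterozygous
    return "genome duplication not excluded", heterozygous


# Hypothetical allozyme genotypes for one triploid strain
strain = {"locus_1": ("a", "a"), "locus_2": ("b", "c")}
pathway, loci = favored_pathway(strain)
print(pathway, loci)
```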

Molecular markers have also contributed to the study of androgenesis, a 
sort of male analogue of gynogenesis, but in this case involving the develop- 
ment of an individual solely under the influence of its paternally derived chro- 
mosomes (i.e., without instructions from the mother’s genetic material). This 
rare process has been demonstrated experimentally in the lab (see Giorgi 1992) 
and studied in nature as well. Using a combination of markers from allozymes, 
chromosomes, and mtDNA, Mantovani and colleagues (1991, 2001; Mantovani 
and Scali 1992) identified wild hybridogenetic strains of Italian stick insects 
(Bacillus rossius-grandii) that had arisen from hybridization between B. rossius 
females and B. grandii males. Hybridogenetic males are infertile, whereas fe- 
males reproduce clonally or hemiclonally. The most surprising discovery, how- 
ever, was that when female B. rossius-grandii were crossed with some sexual 
males, up to 20% of the offspring had the nuclear genetic makeup solely of 
their father. Such androgenetic individuals proved to be diploid (via fusion of 
two sperm heads) and fertile. 

Overall, such detailed understanding of the evolutionary genetics of "unisexual"
taxa was unimaginable prior to the application of molecular markers.
As recently as 1978, in referring to a hybrid-derived parthenogenetic grasshop- 
per, a leading student of the speciation process lamented that “we are never 
likely to know which species was the female parent” (White 1978b). Since then, 
however, molecular-based parentage determinations for taxa with partheno- 
genetic or related reproductive modes have become routine, and indeed, are 
viewed as merely a starting point for more refined genetic analyses of these 
species’ evolutionary origins and pathways. 


SUMMARY 


1. Various speciation patterns are expected to leave characteristic phylogenetic sig- 
natures on the genomes of recently separated species. By revealing such signa- 
tures, molecular markers have prompted reexamination of several long-standing 
questions in speciation theory: How much genetic change accompanies specia- 
tion? What specific genes are involved? Do most speciations entail severe popula- 
tion bottlenecks? Are rates of speciation correlated with rates of genomic evolu- 
tion? What are the temporal durations of alternative speciation modes? How and 
how often do natural selection and sexual selection play direct (or indirect) roles 
in the speciation process? How often do speciation events occur in sympatry?
How prevalent is co-speciation? These and related questions have been provisionally
answered for several studied groups.


2. Molecular markers have practical utility in distinguishing closely related taxa, including
sibling species that may have gone unrecognized by morphological or
other appraisals. Molecular diagnoses sometimes involve sibling species of medical
or economic importance.


3. Molecular approaches have enriched the traditional biological species concept
(BSC) by adding an explicit phylogenetic perspective to discussions of population
relationships and histories. The fundamental distinction between gene trees and
species trees has led to the development and elaboration of genealogical concordance
principles for the recognition of subspecies and species.


4. Molecular markers provide powerful means for identifying hybrid organisms and
for characterizing patterns of introgression. Degrees of hybridization and introgression
have proved to vary along a continuum, from instances of sporadic production
of F1 individuals only to extensive introgression leading to genetic mergers
between formerly separate taxa.


5. Through cytonuclear analyses, several sources of genetic asymmetry in hybrid
zones have been theoretically appreciated and empirically documented. Differential
patterns of introgression can result from inter-locus variation in selection intensities
against alleles on heterologous genetic backgrounds; from Haldane's
rule, whereby gender-specific fitness differences characterize hybrid organisms;
from differential mating behaviors by the sexes engaged in hybridization; or from
other sources of differential gametic exchange.


6. Contrasts between nuclear phylogenies and mitochondrial or chloroplast gene
trees have identified many instances of “cytoplasmic capture” mediated by past
introgressive hybridization. This and other lines of genetic evidence suggest that
reticulate evolution has been fairly common, especially in several plant groups.


7. Genetic markers have revealed that several plant and animal taxa are of hybrid
origin. Especially impressive have been detailed molecular genetic characterizations
of the genomic contributions from the parental species involved and of the
precise cytological pathways leading to the production of hybridization-derived
unisexual biotypes.
All the organic beings which have ever lived on this Earth may be descended
from some one primordial form. 
C. Darwin (1859) 


Study of the gene at the most fundamental level will soon tell us more about 
the phylogenetic relationships of organisms than we have managed to learn 
in all the 173 years since Lamarck. 

R. K. Selander (1982) 


We proceed now to the traditional provenance of molecular phylogenetics: esti- 
mation of evolutionary relationships among species and higher taxa. After speci- 
ation has been completed and reproductive barriers are in place, the genomes of 
separated taxa are free to diverge further, and many of their DNA sequences typ- 
ically do so in more or less time-dependent fashion (see Chapter 4). Thus, overall 
magnitudes of genetic distance provide at least crude guides to evolutionary 
times since species last shared ancestors. Furthermore, particular molecular 
markers, considered individually and, especially, in combination, provide pow- 
erful characters for delineating clades. 

Many molecular methods (protein electrophoresis, immunological assays, 
DNA restriction analysis, DNA hybridization, and others) have been used to esti- 
mate phylogenies, but in recent years direct nucleotide sequencing has revolu- 
tionized the field. Molecular phylogenetic analyses have been conducted on hun- 
dreds of taxonomic groups and at temporal scales collectively ranging from 
Holocene and Pleistocene separations to pre-Paleozoic divergences. No attempt 


402 Chapter 8 


will be made here to treat this vast literature exhaustively. Instead, diverse 
technical and conceptual approaches will be described in order to highlight 
exciting outcomes in higher-level molecular phylogenetics. 


Rationales for Phylogeny Estimation 


Oddly, recovery of phylogeny per se is seldom the final goal of a phyloge- 
netic analysis. Rather, molecular (or other) estimates of phylogeny are 
desired because of their value as a historical backdrop for interpreting eco- 
logical (Harvey et al. 1995, 1996; Losos 1996) and evolutionary processes 
(Nee et al. 1996b). These appraisals may include assessments of evolution- 
ary rates and patterns in organismal phenotypes (such as particular mor- 
phological, physiological, or behavioral traits), biogeographic configura- 
tions of taxa, frequencies of horizontal genetic transmission and reticulate 
evolution, and countless other biological topics. 


Phylogenetic character mapping 


Closely related taxa often tend to be more similar in phenotype than distant 
taxa, although many exceptions exist due to variable evolutionary rates and 
homoplasy (evolutionary convergences, parallelisms, and reversals). 
Phylogenetic hypotheses (explicit or implicit) underlie virtually all conclusions 
in comparative evolution (Harvey and Purvis 1991). For example, the inference 
that powered flight in mammals is a derived rather than an ancestral condition 
rests upon the restricted phylogenetic position of bats (Chiroptera) within a 
group (Mammalia) whose ancestral forms unquestionably were terrestrial. At
issue in this case is whether flight evolved once or more than once (conver- 
gently) in bat evolution, a question whose answer depends on whether the 
members of Chiroptera are monophyletic or polyphyletic. 

Whenever a phylogeny is known with reasonable assurance (e.g., from 
secure molecular evidence), the evolutionary origin(s) and directions of 
change in morphological, behavioral, or other organismal features can be 
illuminated by superimposing trait occurrences on the tree. These traits may 
be alternative states of composite attributes (such as wings) or more narrow- 
ly defined characters (ultimately, the genes or nucleotides actually responsi- 
ble for a given character), and evolutionary interpretations must be adjusted 
accordingly (Zink 2002). For example, the broad attribute "flight" is clearly 
polyphyletic in animals, whereas more specific characteristics often associat- 
ed with flight (such as feathers in birds, echolocation in bats, or compound 
eyes in some insects) might each have arisen only once or a few times. 
However, in this chapter I will use "character" in a generic sense. The short- 
hand phrase "phylogenetic character mapping" (PCM) will therefore refer to 
any attempt to match character states with their associated species on a 
cladogram, the purpose being to reveal the evolutionary origins and histories 
of those traits. The cladogram itself is estimated using data that are different 
from and independent of the character states to be mapped. With the advent 
of molecular approaches, such independent appraisals of phylogeny have
become commonplace. 

An important task for biologists is to understand the history as well as
the mechanistic operation of adaptive (and other) organismal features 
(Autumn et al. 2002; Frumhoff and Reeve 1994; Givnish and Sytsma 1997), 
including evolutionarily labile and continuously variable ones such as 
body size, age of senescence, or metabolic rate. Quantitative traits of this 
sort pose at least two special challenges for PCM. First, the statistical non- 
independence of character values, which is inherent in the fact that multi- 
ple related taxa have partially overlapping phylogenetic histories (partial- 
ly shared tree branches), must be accommodated (Martins and Hansen 
1997; Richman and Price 1992). If it is erroneously assumed, for example, 
that all body size comparisons in a set of taxa under consideration are phy- 
logenetically independent, then in effect, the degrees of freedom would be 
overestimated in a statistical analysis of evolutionary trends. To circumvent 
such errors, especially when analyzing rapidly evolving quantitative traits, 
accommodation methods such as "independent contrasts" have been 
devised (Box 8.1). 

Second, most complex traits are polygenic (almost by definition), and 
various genes and alleles underlying particular phenotypes can wax and 
wane along the branches of phylogenetic trees (Figure 8.1). The final phe- 
notype may be an outcome of rheostatic or threshold genetic (or environ- 
mental) controls whose particular mechanisms of operation remain 
unknown for most phenotypic attributes and characters. All of this can 
complicate the interpretation of PCM. So too can the fact that a molecular 
phylogeny itself is seldom a perfect estimate of organismal history (as will 
become apparent from some of the case studies to come). For this and other 
reasons, evolutionary conclusions from PCM should always be viewed as 
working hypotheses subject to revision with new evidence. 

Despite these many challenges, PCM is a popular and informative 
endeavor in molecular evolution. Most of the examples considered in this 
chapter will entail slowly evolving qualitative characters (Maddison 1994) 
that are relatively straightforward to map onto a molecular phylogeny. 


ANATOMICAL FEATURES. Bats are distinctive flight-specialized mammals, 
so it might seem almost certain that they originated only once in evolution 
(Figure 8.2A). Under the traditional view, microbats (small, nocturnal, 
echolocating species; suborder Microchiroptera) and megabats (large diur- 
nal forms; suborder Megachiroptera) are monophyletic sister taxa. 
Nonetheless, on the basis of neuroanatomical features, an alternative 
"diphyletic" hypothesis (Pettigrew 1986, 1991) was proposed, according to 
which megabats are phylogenetically closer to primates than to microbats. 
If so, then wings and powered flight could have evolved on at least two 
separate occasions in mammals (Figure 8.2B): once in an ancestor of
Microchiroptera and again in a lineage leading to Megachiroptera after its 
separation from Primates. Thus, a dilemma exists: Either the shared mor- 










Figure 8.1 Concept of gradients and thresholds in PCM. Consider a polygenic trait 
that varies as a function of appropriate alleles at many unlinked loci. The diagram 
shows how genetic changes at these loci can produce gradual evolutionary shifts in 
trait expression (intensities of shading). Furthermore, different suites of underlying 
alleles may be responsible for a given phenotypic outcome. A quantitative phenotype 
may also require some threshold number or pattern of alleles for expression. 
Assume, for example, that black in the diagram falls above the required threshold 
(indicating presence of the trait) and non-black falls below the threshold (indicating 
trait absence). "Trait presence" would then have arisen polyphyletically (at positions 
A-E) due to shifts in levels of polygenic support. Thus, a quantitative trait could 
have complex mixtures of both homologous and homoplasious genetic elements. 


phologies associated with powered flight evolved convergently in mega- 
bats and microbats, or shared neuroanatomical traits evolved convergently 
in primates and megabats. 

Several phylogenetic analyses of both nuclear and mtDNA sequences 
have led to firm rejection of the "flying primate" hypothesis for megabats 
and have generally supported a monophyletic scenario for Chiroptera 
(Adkins and Honeycutt 1991; Bailey et al. 1992; Bennett et al. 1988; Mindell 
et al. 1991; Van den Bussche et al. 1998). Thus, when interpreted against the 
backdrop of molecular phylogeny, the original suite of anatomical features 
associated with bat flight probably arose just once. Subsequent molecular 
analyses added an interesting twist, however, by demonstrating that micro- 
bats are paraphyletic with respect to megabats (Figure 8.2C). This finding 
implies that microbats' unique echolocation abilities were lost secondarily 
Figure 8.2 Alternative scenarios for the phylogenetic relationships of microbats,
megabats, and primates. Slashes across tree branches indicate hypothesized ori- 
gins of powered flight. 




in the megabats, or else that they were gained independently in different 
microbat lines (Springer et al. 2001; Teeling et al. 2000, 2002). 
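
The reasoning behind these alternative interpretations can be illustrated with a small parsimony count of the kind that underlies much PCM. The sketch below uses Fitch's parsimony procedure, which is a choice made here for illustration and is not named in the text. The two tree topologies are deliberately simplified, and the labels "microbat lineage 1" and "microbat lineage 2" are hypothetical stand-ins rather than actual taxa.

```python
def fitch(node, states):
    """Return (candidate ancestral states, minimum number of changes) for a subtree.

    node: a tip name (str) or a pair (left_subtree, right_subtree)
    states: dict mapping tip name -> observed character state
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # The descendants agree on at least one state: no change is needed here.
        return shared, left_cost + right_cost
    # The descendants disagree: one change must have occurred on this part of the tree.
    return left_set | right_set, left_cost + right_cost + 1


# Echolocation scored as present (True) or absent (False)
echolocation = {
    "outgroup": False,
    "megabats": False,
    "microbat lineage 1": True,
    "microbat lineage 2": True,
}

# Traditional view: microbats form a clade that is sister to megabats
traditional = ("outgroup", (("microbat lineage 1", "microbat lineage 2"), "megabats"))
# Revised view: microbats are paraphyletic with respect to megabats
paraphyletic = ("outgroup", (("microbat lineage 1", "megabats"), "microbat lineage 2"))

root_states_trad, changes_trad = fitch(traditional, echolocation)
root_states_para, changes_para = fitch(paraphyletic, echolocation)
print(changes_trad, changes_para)
```

On the traditional tree a single change suffices, whereas the paraphyletic tree requires two. Those two changes can be read either as one gain followed by a loss in megabats or as two independent gains, which is why parsimony alone cannot choose between the interpretations given above.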

King crabs of Alaska (genera Lithodes and Paralithodes) are well-known 
decapod crustaceans that look like large versions of "typical crabs," show- 
ing a strongly calcified exoskeleton and a reduced abdomen that folds up 
under the body. By contrast, hermit crabs (approximately 800 species in 
more than 80 genera) have a long, decalcified abdomen that each animal 
coils into a vacant gastropod shell, the hermit's adopted home. Thus, at least
superficially, the morphology of hermit crabs is in between that of true crabs 
and that of the other major groups of decapod crustaceans: lobsters and 
shrimp. Nonetheless, researchers have long suspected close genealogical 
ties between hermits and king crabs for several reasons (Gould 1992): the 
abdomen of king crabs, although reduced, is asymmetrical, as in hermit 
crabs; some pairs of legs are reduced in both king and hermit crabs, where- 
as all ten legs are fully developed in typical crabs; larval king crabs and 
hermits are remarkably alike in form; and carcinization (evolution of crab- 
like features) appears to have been a recurring theme in hermit crab evolu- 
tion under ecological circumstances in which shells of gastropod snails are 
in limited supply (as in the deep sea). 
BOX 8.1 Independent Contrasts





For phenotypic characters displaying only a few discrete states that change 
rarely (such as "wing presence" and "wing absence"), it is usually possible to
trace trait evolution and deduce ancestral states simply by superimposing 
observed distributions of character states on a known or suspected phylogeny. 
However, for phenotypes that vary continuously (such as body size or life 
span), or those that may be evolutionarily labile, or those that may themselves 
show phylogenetic correlations, such exercises are more challenging. 

The method of independent contrasts (Felsenstein 1985b) is one of several 
statistical procedures that can be used when evaluating the evolution and 
coevolution of continuously varying traits (Bennett and Owens 2002; Garland 
et al. 1992; Harvey et al. 1996; Martins 1995, 1996). These methods entail a 
known phylogeny (often estimated from molecular data; Gittleman et al. 
1996a), a range of phenotypes scored in present-day taxa, and a theoretical
model for phenotypic evolution. For example, Felsenstein's original method 
(several elaborations exist; e.g., McPeek 1995; Pagel 1992, 1994) assumes that 
the phenotypes in question have evolved as if by random Brownian motion, 
and it statistically corrects for phylogenetic non-independence of character 
comparisons by transforming measured data for N extant species into a set of 
N - 1 standardized independent contrasts. In the diagram below (after Martins 
and Hansen 1996), four contrasts are phylogenetically independent (have no 
overlapping tree branches), and two of them involve internal nodes. 


[Diagram: phylogeny with tips A, B, C, D, E]

Independent contrasts:
    A-B
    F-C
    D-E
    G-I


Ancestor states (or their probabilities) at internal nodes can often be recon- 
structed based on tree topology and on character measurements in contempo- 
rary species (e.g., Schluter et al. 1997). The assignments are often self-evident, 
e.g., when all species stemming from a node share an identical character state, or
when outgroup comparisons securely polarize the qualitative states examined.
For highly variable or quantitative characters, where assignments may be far less
obvious, the state in the ancestor can be estimated by appropriately averaging 
character values across extant members of that clade (Bennett and Owens 2002). 
Examples of continuously varying traits whose evolutionary histories have 
been analyzed by independent contrasts or related procedures include senes- 
cence and several other quantitative life history features in mammals (Gaillard et 
al. 1994; Gittleman et al. 1996b), musculoskeletal functions in salamanders 
(Lauder and Reilly 1996), and patterns of song evolution and courtship displays 
in birds (Irwin 1996). A computer software package by Purvis and Rambaut
(1995) is available for “comparative analyses by independent contrasts” (CAIC). 
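
The calculation of standardized contrasts under the Brownian motion model can be sketched as follows. This is a minimal illustration and not the CAIC implementation. The tree topology (F as the ancestor of A and B, G as the ancestor of F and C, I as the ancestor of D and E), the branch lengths, and the trait values are all assumptions chosen so that the four contrasts come out in the same order as the list in the box.

```python
import math


def contrasts(node, values, out):
    """Compute Felsenstein-style contrasts for the subtree rooted at `node`.

    node: a tip name (str) or a pair ((child, branch_length), (child, branch_length))
    values: dict mapping tip name -> measured trait value
    out: list to which standardized contrasts are appended
    Returns (estimated trait value at node, extra branch length to add above node).
    """
    if isinstance(node, str):
        return values[node], 0.0
    (left, v_left), (right, v_right) = node
    x_left, extra_left = contrasts(left, values, out)
    x_right, extra_right = contrasts(right, values, out)
    # Lengthen each branch to reflect uncertainty in an estimated nodal value.
    v_left += extra_left
    v_right += extra_right
    # Standardize the contrast by the square root of its expected variance.
    out.append((x_left - x_right) / math.sqrt(v_left + v_right))
    # Nodal value: average of the descendants, weighted by inverse branch length.
    x_node = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
    return x_node, v_left * v_right / (v_left + v_right)


# Hypothetical tree, branch lengths, and trait values (e.g., body size)
F = (("A", 1.0), ("B", 1.0))
G = ((F, 1.0), ("C", 2.0))
I = (("D", 1.5), ("E", 1.5))
root = ((G, 1.0), (I, 1.5))
trait = {"A": 10.0, "B": 12.0, "C": 15.0, "D": 20.0, "E": 26.0}

result = []
contrasts(root, trait, result)
print(result)  # four contrasts for five species: A-B, F-C, D-E, G-I
```

Five species yield N - 1 = 4 contrasts, and it is these values, rather than the raw species measurements, that would be carried into a regression or correlation analysis.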


Phylogenetic analyses of mitochondrial rRNA gene sequences appear to
have clinched the case for close genetic links between hermit and king crabs 
(Figure 8.3). Apparently, king crabs arose from a genealogical subset of hermit 
crabs and, indeed, are nested within the hermit genus Pagurus. Furthermore, 
based on molecular clock considerations and extrapolation from fossil and 
geographic evidence, this split from hermit ancestors was estimated to have 
occurred about 13-25 million years ago, thus placing an upper bound on the 
time that transpired during evolutionary loss of the shell-living habit and the 
complete carcinization of king crabs (Cunningham et al. 1992). Evolutionary 
changes in the timing of events during organismal development (hete- 
rochrony) probably account for these dramatic morphological shifts. 
Figure 8.3 Molecular phylogeny for several species of king crabs and hermit
crabs, oriented using the brine shrimp (Artemia) as outgroup. "P" designates 
species traditionally placed in the genus Pagurus. Numbers indicate percentages of 
parsimony bootstrap support for various clades. (After Cunningham et al. 1992.) 
Living cetaceans traditionally were divided into two distinct suborders:
Odontoceti (echolocating toothed whales and dolphins) and Mysticeti (filter- 
feeding baleen whales). Surprisingly, molecular phylogenies derived from 
sequence analyses of several mitochondrial and nuclear genes have shown 
that carnivorous sperm whales are not most closely related to other toothed 
cetaceans, but rather to baleen whales (Hasegawa et al. 1997; Milinkovitch et 
al. 1994a,b, 1996). Cetaceans as a whole proved to constitute a clade, but the 
relationship of Odontoceti to Mysticeti seems to be one of paraphyly, not 
reciprocal monophyly (Figure 8.4). These findings prompted major reinter- 
pretations of evolutionary transformations among several cetacean features 
(Milinkovitch 1995). For example, PCM analysis indicates that baleen whales 
Figure 8.4 Molecular phylogeny for representative cetaceans based on mitochondrial
rRNA gene sequences. Levels of bootstrap support were near 100% for
most of the putative clades identified in the parsimony and neighbor-joining analy- 
ses. (After Milinkovitch et al. 1993.) 


probably lost the morphological apparatus and capacity for echolocation secondarily
(alternatively, echolocation abilities could have been gained independently
by sperm whales and the other toothed cetaceans).

Molecular analyses also have prompted a reexamination of the phylo- 
genetic position of whales and dolphins within Mammalia (Novacek 1992). 
An evolutionary connection of cetaceans to ungulates (hoofed animals) was 
first suggested more than a century ago, and this notion has been amply 
confirmed by molecular (Goodman et al. 1985; Irwin et al. 1991; Milin- 
kovitch 1992; Miyamoto and Goodman 1986; Southern et al. 1988) as well as 
paleontological and phenotypic evidence (see review in Milinkovitch 1995). 
More controversial has been the question of whether cetaceans are phyloge- 
netically closer to the odd-toed Perissodactyla (a taxonomic order including 
the horse, tapir, and rhinoceros) or the even-toed Artiodactyla (including the 
pig, camel, and deer). Most mitochondrial and nuclear DNA evidence 
favors the Artiodactyla connection (Gatesy 1997; Graur and Higgins 1994). 
Indeed, a recent discovery that whales share several uniquely derived 
SINEs (short interspersed DNA elements) with ruminants and hippopotamuses
(Shimamura et al. 1997) suggests that cetaceans are embedded well
within the artiodactyl lineage (Milinkovitch and Thewissen 1997; Nikaido et 
al. 1999). 

Crossopterygia (“lobe-finned” fishes) are probable evolutionary links 
between Actinopterygia (“ray-finned” fishes) and tetrapods (land verte- 
brates). In 1938, scientists were thrilled by the discovery of a living coelacanth 
(Latimeria chalumnae), a member of a taxonomic subset of lobe-finned fishes 
that formerly was thought to have gone extinct about 65 million years ago. 
Hopes were high that detailed studies of this "living fossil" would resolve 
three long-standing competing hypotheses for the early branching history in 
tetrapod phylogeny: lungfishes (another group of lobe-finned fishes) as sister 
group to coelacanths plus tetrapods (Figure 8.5A); tetrapods as sister group 
to lungfishes plus coelacanths (Figure 8.5B); and coelacanths as sister group 
to lungfishes plus tetrapods (Figure 8.5C). Initial analyses of mitochondrial 
genes (notably cytochrome b and 12S rRNA) provided support for the lungfish 
+ tetrapod clade (Hedges et al. 1993; Meyer and Dolven 1992), and this 
phylogenetic arrangement was used to interpret the histories of 22 morphological 
traits (presence versus absence of a glottis and internal nostrils, pelvic 
girdles joined versus unjoined, etc.) early in vertebrate evolution (Meyer and 
Wilson 1990). However, subsequent analyses of nuclear sequences from 28S 
rDNA gave significant support to the lungfish + coelacanth clade (Zardoya 
and Meyer 1996a). The contradictory nature of these and other molecular 
findings (Gorr et al. 1991; Normark et al. 1991; Sharp et al. 1991; Stock et al. 
1991) next prompted Zardoya and Meyer (1996b, 1997) to sequence the com- 
plete mtDNA genomes of an extant lungfish (Protopterus dolloi) and a coela- 
canth. These expanded data still were phylogenetically inconclusive 
(Zardoya et al. 1998), however, in part because of strong heterogeneity in 
molecular evolutionary rates across loci. Curole and Kocher (1999) used these 
data and other examples from the literature to urge extreme caution in 
extrapolating from the phylogenies of particular genes (or even of whole 
mtDNA genomes) to the phylogenies of organisms. Nonetheless, the balance 
of current evidence seems to favor scenario C in Figure 8.5 (Meyer and 
Zardoya 2003). 

Figure 8.5 Alternative hypotheses for the phylogenetic root of tetrapods. (After 
Meyer and Wilson 1990.) 

Since it was first described in 1869, the giant panda (Ailuropoda melanoleu- 
ca) has been a phylogenetic enigma. It generally looks like a bear (family 
Ursidae), but also has many non-bearlike traits: flattened teeth and other 
adaptations associated with a bamboo diet; an opposable "thumb" (really a 
modified wrist bone); lack of hibernation; a bleating voice like a sheep; and a 
karyotype of only 42 chromosomes, compared with bears' 74. More than 40 
morphological treatises reached no consensus on whether giant pandas are 
phylogenetically allied to bears, raccoons, or neither, but molecular appraisals 
apparently solved the mystery (O'Brien 1987). Based on diverse data from 
protein electrophoresis (Goldman et al. 1989), protein immunology, DNA 
hybridization (O'Brien et al. 1985a; Sarich 1973), and DNA sequencing 
(Hashimoto et al. 1993; Slattery and O'Brien 1995), the giant panda lineage 
originated about 20 mya as an early offshoot of the Ursidae clade (Figure 8.6). 

This molecular phylogeny also prompted a reexamination of the giant 
panda's bizarre karyotype. Using refined laboratory methods that reveal 
details of chromosomal banding patterns, the differences between the giant 
panda's 42 chromosomes and bears' 74 chromosomes were shown to be 
mechanistically superficial, apparently attributable to simple centromeric 
fusions along the giant panda lineage (O’Brien et al. 1985a). Overall, these 
PCM studies on pandas illustrate two broader points: that phylogenies are 
most convincing when supported concordantly by multiple lines of molecular 
evidence; and that consensus trees can offer an informative backdrop 
for interpreting the evolutionary histories of problematic molecular-level 
and cellular-level characters, just as they can for morphological traits. 

Figure 8.6 Consensus molecular phylogeny showing the genealogical position 
of the giant panda relative to bears (Ursidae, node A) and raccoons (Procyonidae, 
node B). Lowercase letters represent suggested subfamily designations. For recent 
molecular data and refinements of thought concerning phylogenetic relationships 
of the red panda (another phylogenetic enigma), see Flynn et al. (2000). (After 
O'Brien 1987.) 


BEHAVIORAL, PHYSIOLOGICAL, AND LIFE HISTORY FEATURES. Molecular 
phylogenies can also provide a useful backdrop for interpreting the evolu- 
tionary histories of behavioral and other classes of organismal characteris- 
tics. A case in point involves interspecific brood parasitism, or "egg-dump- 
ing," by females into other species' nests with the purpose of duping what 
often then become foster parents. In the Americas, several cowbird species 
are dedicated brood parasites: Molothrus rufoaxillaris specializes on one host 
species; M. aeneus and Scaphidura oryzivora parasitize confamilial genera 
only; and M. ater and M. bonariensis utilize a wide variety of host taxa. PCM 
analyses based on mtDNA cytochrome b gene sequences from nearly three 
dozen species of Icteridae (cowbirds, plus other blackbirds that are non-par- 
asitic) indicate that all the brood parasites listed above form a clade within 
which the single-host specialist (M. rufoaxillaris) branched off earliest. 
Furthermore, the two generalist species (M. ater and M. bonariensis) consti- 
tute a terminal subclade (Lanyon 1992; Lanyon and Omland 1999). These 
results suggest that host specificity is the ancestral condition and that host 
generalization is derived. This conclusion challenged an earlier notion that 
across evolutionary time brood parasites might increasingly specialize on 
fewer host taxa (because at least some host species eventually evolve defen- 
sive mechanisms against the brood parasite). 

Researchers have used molecular PCM approaches to chart the direction 
of behavioral evolution in Lasioglossum sweat bees, a hymenopteran group 
containing both solitary and eusocial members with a diversity of nest archi- 
tectures. Danforth et al. (2003) employed DNA sequences from two nuclear 
genes and one mitochondrial gene to estimate a phylogeny for 42 Lasioglossum 
taxa, onto which they then mapped social behaviors. Salient findings were 
that eusociality had a single origin within the group, but that multiple (about 
six) reversals to solitary nesting must have occurred within the eusocial clade. 
Results supported the view that eusociality may be hard to evolve, but some- 
what easier to lose (Figure 8.7). In an earlier study of eight species in the sub- 
genus Evylaeus, Packer (1991) had employed multi-locus allozyme data to 
estimate a cladogram onto which he then mapped architectural features of the 
bees’ nests. From this analysis, one notable characteristic—an extended open- 
ing of brood cells during development—was inferred to have originated on 
two separate evolutionary occasions among the species considered. 


Figure 8.7 DNA sequence-based phylogeny for sweat bees (Lasioglossum) onto 
which have been mapped nesting behaviors. Numbers on branches indicate levels 
of statistical support (Bayesian posterior probabilities) for various groups. (After 
Danforth et al. 2003.) 

Figure 8.8 Coarse-focus and finer-focus PCM analyses of nesting habits in birds. 
(A) Phylogenetic distribution of “safe” nests in major avian groups. (After Owens 
and Bennett 1995.) (B) A closer view of the phylogenetic distribution of nesting 
modes in 17 swallow species plus an outgroup. (After Winkler and Sheldon 1993.) 
In each case, the molecular phylogenies were based on DNA hybridization data. 


A PCM analysis of nesting habits in birds further illustrates how 
"behaviors" and "extended morphological phenotypes" can be interrelated. 
Birds' nests can be broadly categorized as "safe" (e.g., in burrows or tree 
cavities) or “open,” and PCM analyses indicate that safe nesting habits have 
arisen multiple times in avian evolution (Figure 8.8A). The swallow family 
Hirundinidae has been of special interest because its approximately 90 
species collectively display perhaps the greatest diversity of nest construc- 
tion modes known in any avian taxonomic family. Some swallow species 
burrow into cliffs, some use adopted natural cavities, and others construct 
mud nests with designs ranging from open cups to roofed and eaved 
abodes. From maps of these nesting modes superimposed on a molecular 
phylogeny for representative swallows (Figure 8.8B), the following were 
deduced: burrowing probably was the primitive nesting mode in the group, 
predating cavity adoption and mud nesting; obligate cavity adoption is con- 
fined mostly to a New World clade; mud-nest construction originated once 
in the family and diversified primarily in Africa; and mud nests "evolved" 
higher complexity through time, from simple cups to fully enclosed abodes. 

Endothermy, the ability to maintain elevated body temperatures by 
metabolic means, is rare among fishes, the only documented examples 
being within large teleosts of the suborder Scombroidei (including tunas, 
mackerels, and billfishes). Did endothermy evolve once or multiple times 
within this assemblage? The character state endothermy, when mapped 
onto a phylogeny estimated from cytochrome b mtDNA sequences, indi- 
cates that this physiological adaptation arose independently at least three 
times within the Scombroidei (Block et al. 1993). Diverse physiological and 
anatomical pathways are involved in these phenotypic convergences 
(Figure 8.9). 
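
The counting logic behind a claim of this kind can be sketched with Fitch parsimony, which finds the minimum number of state changes needed to explain the tip states on a given tree. The example below is only a minimal illustration and not the analysis of Block et al. (1993): the tree shape and the tip labels (endo_A, ecto_1, and so on) are hypothetical, chosen so that endothermic tips fall in three separate places on the tree.

```python
# Fitch parsimony on a rooted binary tree written as nested tuples.
# Tip names and topology are hypothetical, for illustration only.

def fitch(tree, tip_states):
    """Return (possible_states, min_changes) for the subtree `tree`."""
    if isinstance(tree, str):          # a tip: its state is observed
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:                         # children agree: no extra change
        return shared, left_changes + right_changes
    # children disagree: one additional change is required
    return left_set | right_set, left_changes + right_changes + 1

toy_tree = (
    (("endo_A", "ecto_1"), "ecto_2"),
    (("endo_B", "ecto_3"), (("endo_C1", "endo_C2"), "ecto_4")),
)
# The state of each hypothetical tip is encoded in its name prefix.
toy_states = {name: name.split("_")[0] for name in
              ["endo_A", "endo_B", "endo_C1", "endo_C2",
               "ecto_1", "ecto_2", "ecto_3", "ecto_4"]}

root_states, changes = fitch(toy_tree, toy_states)
print(root_states, changes)   # {'ecto'} 3
```

On this toy tree the minimum is three changes with an ectothermic root, so three independent gains are the most parsimonious reading. A real analysis would of course depend on the estimated phylogeny and its support.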

Ascidians (sea squirts) are thought to be primitive chordates, as gauged 
in part by the presence of a notochord in their tadpole-like larvae. 
Phylogenetic analyses based on rDNA sequences (Field et al. 1988) and on 
molecular features of muscle actins (Kusakabe et al. 1992) bolstered this 
view by placing ascidians closer to vertebrates than to invertebrates. Two 
distinctive reproductive/developmental modes are exhibited among the 
2,300 known species of Ascidiacea: solitary forms, and communal forms that 
can asexually generate colonies by budding, strobilation, or regeneration. 
Under an orthodox classification based on characteristics of the branchial 
sac and gonads, ascidians were divided into the orders Enterogona and 
Pleurogona, irrespective of solitary versus colonial lifestyle. If this taxonomy 
reflects phylogeny, then one or both lifestyles probably evolved multiple 
times. Alternatively, if the distinct lifestyles register a basal phylogenetic 
split, then the branchial sac and gonadal characters would be homoplasious. 
To distinguish between these hypotheses, Wada et al. (1992) phylogenetically 
analyzed ascidian 18S rDNAs. They showed that ascidian species fell into 
two distinct clades coinciding with orthodox taxonomy. Thus, solitary and 
colonial lifestyles probably were gained or lost independently after the 
phylogenetic split between Enterogona and Pleurogona. 

Figure 8.9 Distribution of endothermy in the phylogeny of Scombroidei marine 
fishes. The phylogeny was estimated from mtDNA sequences. Letters indicate 
three separate origins of endothermy, each with a different physiological basis: A, 
modification of the superior rectus muscle into a thermogenic organ; B, modification 
of the lateral rectus muscle into a thermogenic organ; and C, use of vascular 
countercurrent heat exchangers in the muscles, viscera, and brain. (After Block et 
al. 1993; see also Block and Finnerty 1994.) 
[Tip labels recovered from the figure: Istiophorus platypterus, Tetrapturus albidus, 
Tetrapturus audax, Tetrapturus angustirostris.] 

Behavioral and morphological traits can be thoroughly intertwined, a 
point further illustrated by male tail lengths in Xiphophorus platyfishes and 
swordtails. These fishes had raised an evolutionary question: Might female 
mating preferences for males with swordlike tails predate the evolutionary 
origin(s) of such tails? If so, then sexual selection based on mate choice 
could be driven in large part by this "preexisting bias." Reinterpreting orig- 
inal molecular studies by Meyer et al. (1994) and Basolo (1995), Schluter et 
al. (1997) showed through PCM analyses that the ancestral condition of 
Xiphophorus males (and hence the issue of preexisting female mating bias) 
could not be decided definitively, due to rapid evolutionary transitions 
between sword-carrying and swordlessness within the genus (Figure 8.10). 
This study illustrates how ancestral states can be deduced by PCM, but it 
also highlights the statistical or probabilistic nature of the enterprise (D. R. 
Maddison 1995), especially for rapidly evolving traits. 
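
Why rapid transitions blur ancestral states can be illustrated with a small likelihood calculation. The sketch below assumes a symmetric two-state Markov model (sword present or absent, with equal forward and backward rates) and applies Felsenstein's pruning recursion to a hypothetical four-tip tree. It is a simplified stand-in and not the analysis of Schluter et al. (1997); the tip names, branch lengths, and rates are invented for illustration.

```python
import math

def change_probs(rate, length):
    """Probabilities of ending a branch in the same / the other state."""
    same = 0.5 + 0.5 * math.exp(-2.0 * rate * length)
    return same, 1.0 - same

def conditional_likelihoods(node, tip_states, rate):
    """Likelihood of the tips below `node`, given state 0 or 1 at `node`.

    A tip is a string; an internal node is a tuple of (child, branch_length).
    """
    if isinstance(node, str):
        return (1.0, 0.0) if tip_states[node] == 0 else (0.0, 1.0)
    likes = [1.0, 1.0]
    for child, length in node:
        below = conditional_likelihoods(child, tip_states, rate)
        same, diff = change_probs(rate, length)
        for state in (0, 1):
            likes[state] *= same * below[state] + diff * below[1 - state]
    return tuple(likes)

def root_support(tree, tip_states, rate):
    """Relative likelihoods of state 0 and state 1 at the root."""
    l0, l1 = conditional_likelihoods(tree, tip_states, rate)
    total = l0 + l1
    return l0 / total, l1 / total

# Hypothetical tree: ((sp1, sp2), (sp3, sp4)), all branch lengths 1.0.
toy_tree = (
    ((("sp1", 1.0), ("sp2", 1.0)), 1.0),
    ((("sp3", 1.0), ("sp4", 1.0)), 1.0),
)
toy_tips = {"sp1": 1, "sp2": 1, "sp3": 1, "sp4": 0}   # 1 = sworded

for rate in (0.01, 5.0):   # slow versus fast trait evolution
    print(rate, root_support(toy_tree, toy_tips, rate))
```

With the slow rate the sworded state is strongly favored at the root, whereas with the fast rate the two states become nearly equally likely, which is the kind of ambiguity the pie diagrams in such analyses express.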

Plants also have morphologies and behaviors that can be subjected to 
PCM via molecular markers. Ever since Darwin (1877) speculated on the 
matter, androdioecy (in which male and hermaphroditic flowers occur on 
separate plants) has been viewed as an intermediate step in the evolution of 
dioecy (separate sexes) from monoecy (hermaphroditism). However, in the 
Datiscaceae, which includes one androdioecious species (Datisca glomerata) 
plus several dioecious ones, androdioecy was postulated to represent a 
derived condition (Liston et al. 1990). To test this hypothesis, Rieseberg et al. 
(1992) utilized cpDNA restriction site data to construct a molecular phylogeny 
for Datiscaceae and outgroup taxa. Results suggested that D. glomerata 
occupies a derived position relative to dioecious members of the family 
and, hence, that androdioecy in this case evolved from dioecy rather than 
from monoecy. However, later phylogenetic analyses of rbcL and 18S rDNA 
sequences disagreed as to whether dioecy or monoecy was ancestral to 
Datisca, so the issue of evolutionary direction may not yet be fully settled 
(Swensen et al. 1998). 

Figure 8.10 Evolution of males' tail conditions (swords versus swordless) in 25 
species of Xiphophorus (X) and Priapella (P) fishes. Asterisks indicate species in 
which female preference for males with swords has been experimentally documented 
(Basolo 1995). Pie diagrams indicate the relative likelihoods from PCM 
analysis of particular ancestral tail conditions at the nodes. (After Schluter et al. 
1997; based on the molecular phylogeny from Meyer et al. 1994.) 

Darwin (1875) was also well aware of carnivorous plants, which possess 
whole suites of anatomical and physiological features associated with attract- 
ing, trapping, killing, and digesting animals and absorbing their products. 
Given this complexity, it might be supposed that the evolution of carnivory 
would be difficult and rare in the plant world. Nonetheless, when this lifestyle 
was recently mapped onto a phylogeny based on rbcL gene sequences for sev- 
eral dozen taxonomic plant families, carnivory was revealed to be poly- 
phyletic, and even some subcategories, such as "pitcher traps," may have aris- 
en three or more times independently (Albert et al. 1992). On the other hand, 
"snap-traps" proved to be monophyletic (Cameron et al. 2002), as did various 
other detailed components of carnivory that clustered phylogenetically. Thus, 
overall, carnivory in plants displays a mixture of homologous and analogous 
elements. As phrased by Albert et al. (1992), “form is not a reliable indicator of 
phylogenetic relationships among carnivorous plants at highly inclusive levels 
(such as trapping mechanism), whereas it appears to be at less inclusive 
ones (such as glandular anatomy).” 

Even microbes and their metabolic capabilities have been the subject of 
PCM. In one interesting example, DeLong et al. (1993) used small-subunit 
rRNA gene sequences to estimate the phylogeny of bacteria that employ 
iron-rich magnetosomes inside their cells to orient to Earth’s geomagnetic 
field as they swim. Findings from this PCM exercise indicated a poly- 
phyletic origin for magnetotaxis: magnetosomes based on iron oxides 
evolved independently from those based on iron sulfides. 

The studies described above merely introduce the logic and scope of 
molecular PCM approaches. Several additional examples are described in 
Table 8.1. 





Biogeographic assessment 


A second rationale for phylogeny estimation is its use in biogeographic 
reconstruction. Just as phylogeographic relationships among conspecific 
populations have been revealed through molecular analyses (see Chapter 6), 
so too have phylogeographic relationships among species and higher taxa. 


VICARIANCE VERSUS DISPERSAL. The potential geographic range of any 
taxon is of course limited by the suitability of environmental conditions. 
Within that zone of ecological tolerance, realized geographic distributions 
are additionally influenced by historical factors. The "success" of countless 
human-introduced plants and animals around the world is testimony to the 
fact that not all habitats suitable for a species are occupied naturally by that 
species (or, perhaps, that humans have disturbed native habitats so much 
that such introductions often succeed). Thus, whether a species occurs in a 
particular area is a function not only of ecology, but also of historical demo- 
graphic and dispersal patterns, which themselves are influenced by the 
proximity and spatial relationships of habitable environments. The study of 
such historical factors is the focus of molecular biogeography. 

When related taxa show disjunct geographic ranges, two competing 
hypotheses often are advanced to account for these spatial arrangements 
(Box 8.2). Under dispersalist scenarios, such a taxonomic group came to 
occupy its current range through active or passive dispersal across a preex- 
isting geophysical or ecological barrier. Alternatively, under vicariance 
models, the more or less continuous ranges of ancestral taxa were sundered 
by geophysical or ecological events. Such "vicariant events" might have 
been the uplift of a mountain range sundering lowland species, a continen- 
tal breakup that partitioned terrestrial organisms, or subdivision of an ocean 
basin by the rise of an isthmus separating marine taxa. Dispersal and vic- 
ariance models both subscribe to the proposition that speciation is predom- 
inantly allopatric. However, an important prediction of vicariance biogeog- 
raphy (not shared by dispersalist scenarios) is that the cladogram for a tax- 
onomic group should match the historical "area cladogram" of environ- 
ments occupied (Box 8.2). 
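
The congruence prediction can be made concrete with a small check: relabel each taxon by the area it occupies and ask whether the resulting tree contains the same groupings as the area cladogram. The sketch below is a minimal version that assumes exactly one taxon per area; the species names and the area relationships are hypothetical, and the function names are introduced here only for illustration.

```python
def clades(tree):
    """Return (leaf_set, set_of_clades) for a tree written as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)              # the group defined by this node
    return leaves, found

def matches_area_cladogram(taxon_tree, taxon_to_area, area_tree):
    """True if the taxon tree, relabeled by area, has the area tree's clades."""
    def relabel(node):
        if isinstance(node, str):
            return taxon_to_area[node]
        return tuple(relabel(child) for child in node)
    return clades(relabel(taxon_tree))[1] == clades(area_tree)[1]

# Hypothetical area history and hypothetical taxa, one per island.
area_tree = (("Cuba", "Hispaniola"), "Jamaica")
taxon_to_area = {"sp_c": "Cuba", "sp_h": "Hispaniola", "sp_j": "Jamaica"}

congruent = (("sp_c", "sp_h"), "sp_j")
incongruent = (("sp_c", "sp_j"), "sp_h")

print(matches_area_cladogram(congruent, taxon_to_area, area_tree))     # True
print(matches_area_cladogram(incongruent, taxon_to_area, area_tree))   # False
```

A match is consistent with vicariance but does not prove it, and a mismatch is what dispersalist scenarios more readily allow. Widespread taxa or several taxa per area would need a more elaborate treatment than this sketch provides.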

Critical tests of specific vicariance scenarios traditionally involved phy- 
logenetic appraisals based on nonmolecular characters (e.g., Cracraft 1986), 
but since the 1980s molecule-based phylogenies have played ever-increasing 
roles. For example, Bermingham et al. (1992) used data from mtDNA restric- 
tion sites to test a vicariance model for the evolution of North American birds 
in the black-throated green warbler complex. It had been proposed that 
episodic glacial advances during the Pleistocene repeatedly fragmented the 
ranges of forest-dwelling birds into eastern and western populations in such 
a way that subsequent speciations produced a series of western endemics, 
each linked phylogenetically to the widespread eastern form (Dendroica 
virens in this case), but at different evolutionary depths (Mengel 1964). 
Molecular data proved not to be fully consistent with this scenario, however, 
suggesting instead that some western warblers in the black-throated green 
complex budded off from one another (perhaps via inter-montane isolations) 
rather than directly from D. virens (Bermingham et al. 1992). 

A similar Pleistocene scenario constituted conventional wisdom for 
anuran evolution in southwestern Australia, where several western 
endemics were hypothesized to have arisen following multiple invasions 
from eastern source stocks. However, immunological studies of albumins 
from more than 20 frog species led to rejection of the multiple-invasion 
scenario in favor of a model that includes speciation events within 
southwestern Australia (Maxson and Roberts 1984; Roberts and Maxson 1985). 
Furthermore, based on molecular evidence, many of these speciation events 
appeared to predate the Pleistocene significantly. 

TABLE 8.1 Molecule-based studies illustrating the wide diversity of evolutionary 
time frames, taxa, and organismal phenotypes that have been the 
subject of phylogenetic character mapping (PCM) 

Basis for molecular phylogeny           Phenotype mapped                             Reference
Multiple gene sequences                 Photosynthesis in prokaryotes                Raymond et al. 2002
Complete genomic sequences              Circadian clock genes in prokaryotes         Dvornyk et al. 2003
Chloroplast gene sequences              Endosperm in flowering plants                Geeta 2003
Nuclear and cpDNA gene sequences        Nectar spurs in columbine plants             Hodges and Arnold 1994
rRNA gene sequences                     Position of the ovary in herbaceous plants   Kuzoff et al. 1999
rDNA sequences and AFLPs                Flower pollination by hummingbirds           Beardsley et al. 2003
Nuclear and mtDNA gene sequences        Fruiting body structures in fungi            Hibbett and Binder 2002
Nuclear and mtDNA gene sequences        Symbiosis of termites and fungi              Aanen et al. 2002
A large variety of gene sequences       Venom composition in cone snails             Olivera 2002
mtDNA gene sequences                    Mutualism between ants and aphids            Shingleton and Stern 2002
Nuclear gene sequences                  Agriculture in ants                          Mueller et al. 1998; Currie et al. 2003
Nuclear gene sequences                  Wing features in stick insects               Whiting et al. 2003
Nuclear and mtDNA gene sequences        Social behavior in thrip insects             Morris et al. 2002
mtDNA gene sequences                    Courtship songs in lacewing insects          Wells and Henry 1998
mtDNA and nuclear gene sequences        Features of microsatellite loci in wasps     Zhu et al. 2000
Ribosomal gene sequences                Compound eye in arthropods                   Oakley and Cunningham 2002
mtDNA gene sequences                    Eusociality in marine shrimp                 Duffy et al. 2000
mtDNA gene sequences                    Egg-mimic structures in male fish            Porter et al. 2002; Porterfield et al. 1999
mtDNA gene sequences                    Brood pouches in male-pregnant fish          Wilson et al. 2001, 2003
mtDNA gene sequences                    Placentas in live-bearing fish               Reznick et al. 2002
mtDNA gene sequences                    Sexual isolation behaviors in fish           Mendelson 2003
mtDNA gene sequences                    Müllerian mimicry in frogs                   Symula et al. 2001
mtDNA sequences and DNA hybridization   Flightlessness in birds                      Cubo and Arthur 2001
Mitochondrial gene sequences            Polygynous mating in birds                   Searcy et al. 1999
Mitochondrial gene sequences            Plumage features in orioles                  Omland and Lanyon 2000
Allozymes                               Dietary habits in Galápagos finches          Yang and Patton 1981; Schluter et al. 1997
Ribonuclease gene sequences             Ruminant digestion in mammals                Jermann et al. 1995; Schluter et al. 1997
Mitochondrial gene sequences            Human domestication of horses                Vila et al. 2001
Mitochondrial gene sequences            Coat colors in wild mice                     Nachman et al. 2003

Considerable molecular attention has been devoted to the possible roles 
of Pleistocene ice ages in vicariant population separations and speciation in 
plants (e.g., Comes and Kadereit 1998) and animals (Hewitt 1996), both in 
temperate regions (Knowles 2001) and in the tropics (Moritz et al. 2000; 
Patton et al. 1994a,b). Based on molecular phylogenies for literally hundreds 
of species (Avise 2000a; see Chapter 6), it is now abundantly clear that many 
conspecific populations were isolated and began differentiating in 
Quaternary refugia, sometimes achieving the status of taxonomic species by 
the present, and sometimes not (Avise and Walker 1998; Klicka and Zink 
1997). Thus, for contemporary distributions of closely related taxa, a recur- 
ring phylogeographic theme is vicariant separation(s) during the Pleistocene 
and earlier, followed by post-Pleistocene dispersal to the current configurations 
(Bernatchez and Wilson 1998; Hewitt 1999, 2000; Jaarola and Tegelström 
1995, 1996; Taberlet et al. 1998). 

Vicariance versus dispersal questions also apply in the marine realm. For 
example, closely related pairs of more than 50 tropical shoreline-restricted 
fish taxa have disjunct distributions in the Pacific Basin, being separated by 
at least 5,000 km (the distance between the closest islands in the central 
versus eastern Pacific). Long-distance dispersal, probably via planktonic larvae 
traveling west to east with equatorial currents, may have connected 
populations in these areas fairly recently. Alternatively, fish taxa inhabiting the 
eastern Pacific might be vicariant relicts of former worldwide ancestors in the 
Tethys Sea, an ocean that separated Laurasia from Gondwana following the 
breakup of the supercontinent Pangaea more than 90 mya, during the 
Mesozoic (McCoy and Heck 1976). 

BOX 8.2 Vicariance versus Dispersal in Biogeography 

Disjunct distributions of related taxa are biogeographically intriguing. 
Traditionally, disjunct ranges were interpreted to evidence dispersal across 
preexisting geographic or ecological barriers, usually from a biogeographic 
"center of origin" where a given taxonomic group presumably originated 
(Darlington 1957, 1965). Dispersalist explanations sometimes became quite 
strained, however, as, for example, in accounting for global arrangements of 
relatively sedentary creatures such as the flightless ratites (ostriches, rheas, 
emus, and their presumed relatives). Vicariance hypotheses challenge 
dispersalist ones by proposing that disjunct ranges result from historical sunderings 
of ancestral taxa by geophysical events, without the need to invoke improbable 
long-distance dispersal or “sweepstakes” (rare and lucky) colonizations 
(Simpson 1940). 

Vicariance biogeography as a formal discipline (Humphries and Parenti 
1986; Nelson and Platnick 1981; Nelson and Rosen 1981; Rosen 1978) grew 
out of two developments in the 1960s and 1970s (Wiley 1988): a growing 
appreciation, based on the study of plate tectonics and other geologic 
processes, that the Earth’s features were not fixed, but rather were historically 
dynamic; and the growth of cladistics and parsimony (see Chapter 4), which 
armed researchers with new conceptual outlooks and analytical tools for 
phylogeny reconstruction. 

Under strict vicariance hypotheses (in contrast to dispersalist 
expectations), cladistic relationships among related disjunct taxa should mirror 
faithfully the historical relationships among the geographic regions occupied. 
Indeed, one major goal of vicariance biogeography is to answer questions 
about the biological history of the real estates themselves (Wiley 1988): for 
example, "Is Cuba more closely related to Hispaniola than to Jamaica?" This 
is accomplished through comparative searches for congruent patterns 
("generalized tracks") in organismal phylogeography, while also realizing that 
most biotic communities will have had both "branching" (vicariant) and 
"reticulate" (secondary dispersal) events in their histories (Enghoff 1995; 
Ronquist 1997). 

For several such fish groups on opposite sides of the Pacific, as well as 
some invertebrates with similar distributions, molecular data have clearly 
eliminated the ancient vicariance hypothesis by demonstrating relatively 
small genetic distances that are more consistent with recent (< 3 mya) or 
sometimes ongoing genetic contacts across the Pacific basin (Baldwin et al. 
1998; Palumbi 1994; Palumbi et al. 1997; Rosenblatt and Waples 1986). 
Indeed, even on a circumglobal scale, phylogenetic analyses of mtDNA in 
several taxonomic assemblages of oft-cryptic marine species and genera 
have yielded estimated evolutionary separation dates falling within the last 
0-20 million years (Bermingham et al. 1997; Bowen and Grant 1997; Bowen 
et al. 2001; Colborn et al. 2001; see Aoyama et al. 2001, for a possible excep- 
tion involving Anguilla eels). Of course, even if most of these marine con- 
geners separated rather recently, that does not negate the likelihood that 
ancient vicariant events (perhaps related to the Tethys Sea) had genetic 
effects as well. Some populations sundered in ancient times may since have 
evolved into what today are recognized as higher taxa (often with second- 
ary changes in geographic distributions resulting in range overlaps of some 
of the constituent species). 

Thus, one important advantage of molecular approaches is that tempo- 
ral issues (based on molecular clock considerations) can be examined in 
addition to cladistic assessments per se. In another example, extensive 
molecular data sets have been employed in conjunction with other biogeo- 
graphic evidence to assess the evolutionary times of separation between a 
wide variety of fishes, amphibians, reptiles, birds, and mammals in the 
Caribbean region (see review in Hedges 1996). Geologic evidence indicates 
that the Greater Antilles islands formed in close proximity to North and 
South America about 110-130 mya, but via plate tectonics had begun sepa- 
rating from these mainlands by the late Cretaceous (80 mya). Much of the 
present West Indian biota might reflect these ancient proto-Antillean vicari- 
ant separations. Alternatively, post-vicariant dispersal might account for the 
presence of related taxa on the various islands. 
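
The temporal side of such a test can be sketched as a simple clock conversion: translate each genetic distance into an estimated separation time and compare it with the date of the proposed vicariant event. In the sketch below, the calibration (0.6 million years per unit of distance) and the distances are illustrative assumptions and are not values from Hedges (1996); only the approximate 80 mya separation date comes from the text.

```python
MY_PER_DISTANCE_UNIT = 0.6   # assumed strict-clock calibration (hypothetical)
VICARIANCE_MYA = 80.0        # late Cretaceous separation cited in the text

def divergence_mya(distance, my_per_unit=MY_PER_DISTANCE_UNIT):
    """Convert a genetic distance to a separation time under a strict clock."""
    return distance * my_per_unit

def classify(distance, vicariance_mya=VICARIANCE_MYA):
    """Label a taxon pair by whether its split is as old as the barrier."""
    if divergence_mya(distance) >= vicariance_mya:
        return "old enough for ancient vicariance"
    return "younger than the barrier"

# Hypothetical distances for island versus mainland pairs (invented values).
toy_pairs = {"pair_1": 20.0, "pair_2": 55.0, "pair_3": 140.0}
for name, distance in toy_pairs.items():
    print(name, round(divergence_mya(distance), 1), classify(distance))
```

Pairs whose estimated split is much younger than the barrier are more readily explained by later dispersal. A real analysis would also carry the uncertainty of the calibration and of each distance estimate, which this sketch ignores.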

Hedges' (1996) analyses proved inconsistent with the hypothesis of 
ancient vicariance for most of this biota in two regards (Figure 8.11). First, 
in phylogenetically independent comparisons of numerous pairs of related 
island taxa, or island versus mainland forms, genetic distances showed a 
large variance, suggesting widely differing colonization dates (as predicted 
under most dispersalist scenarios). Second, most of the estimated diver- 
gence dates, based on molecular clocks, fell well within the Cenozoic, long 





gą Chapter? 
Asteroid impact 


proto-Antilles 
Geologic separation from mainland Cuba/Hsp/PR 











West Indies/ 


mainland ecc o 





Cuba/Jamaica 





Jamaica / Hispaniola 


Cuba/ Hispaniola 





Hispaniola / 
Puerto Rico ; : 
90 60 30 9 

Millions of years ago 


i i I i [i i 4 ! 


140 80 20 
Albumin immunological distance 


Figure 3.11 Empirical molecular tests of dispersalist versus vicariance hypothe- 
ses for the origins of terrestrial vertebrates in the Caribbean region. Shown are 
immunological distances (in albumin proteins) and associated estimates of evolu- 
tionary separation times (under an evolutionary clock) between varíous island and 
mainland species (each comparison is indicated by a dot). Shaded areas indicate 
dates of faunal separation predicted under vicariance scenarios based on geologic 
events. (After Hedges et al. 1992b.) : 


after the postulated vicariant geographic events that otherwise might have 
been involved. Closer phylogenetic inspections further revealed that South 
America was the original source for most island lineages of amphibians 
and reptiles (nonvolant animals that presumably arrived by flotsam drift- 
ing on oceanic currents), whereas North America and Central America 
were the dispersal sources for most birds and bats (volant animals that sim- 
ply flew to the islands from these more proximate continental regions). 
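
The clock-based reasoning behind this test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibration of 0.6 million years per unit of albumin immunological distance and the three example distances are hypothetical values chosen for the sketch, whereas the 80-mya separation date comes from the geologic account above.

```python
# Assumed calibration: million years of separation per unit of albumin
# immunological distance. Illustrative value, not taken from the text.
MY_PER_ID_UNIT = 0.6

# Date by which the proto-Antilles had begun separating from the
# mainlands, as stated in the text (late Cretaceous, 80 mya).
VICARIANCE_MYA = 80.0


def divergence_time_mya(immunological_distance, my_per_unit=MY_PER_ID_UNIT):
    """Convert an immunological distance to a separation time under a strict clock."""
    if immunological_distance < 0:
        raise ValueError("immunological distance cannot be negative")
    return immunological_distance * my_per_unit


def postdates_vicariance(immunological_distance, vicariance_mya=VICARIANCE_MYA):
    """Return True if the estimated split is younger than the vicariant event."""
    return divergence_time_mya(immunological_distance) < vicariance_mya


# Hypothetical distances for three island/mainland comparisons.
comparisons = {"pair A": 15, "pair B": 60, "pair C": 140}

for name, distance in comparisons.items():
    time = divergence_time_mya(distance)
    verdict = "dispersal" if postdates_vicariance(distance) else "vicariance not excluded"
    print(f"{name}: ID = {distance}, about {time:.0f} mya ({verdict})")
```

Under a strict clock, a wide spread of distances among independent comparisons translates directly into a wide spread of colonization dates, which is the pattern that argued against a single ancient vicariant split.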

Many more phylogeographic insights have been gleaned through these 
and other molecular comparisons of Caribbean faunas, both vertebrate and 
invertebrate (Burnell and Hedges 1990; Hedges 1989; Hedges et al. 1991; 
Klein and Brown 1995; Losos et al. 1998; Seutin et al. 1994; Shulman and 
Bermingham 1995). For example, the six species of Anolis lizards on Jamaica 
constitute a clade that radiated from a colonizer that probably arrived about 
14 mya (Hedges and Burnell 1990). Similarly, the nine species of Jamaican 
land crabs (unique among all the world's crabs in providing active brood 
care for larvae and juveniles) apparently arose in situ from a common ances- 
tor that inhabited the island about 4 mya (Schubart et al. 1998). 

A remarkable historical instance of long-distance dispersal was intimated
by molecular findings for two spatially disjunct species of annual
plants: the diploid Senecio flavus of the Saharo-Arabian and Namibian 
deserts in Africa and the tetraploid S. mohavensis of the Mojave and 
Sonoran deserts in North America. In allozyme assays, these two species 
proved to be remarkably similar (Nei's I = 0.95), a genetic result that not 
only affirmed their traditional congeneric status based on their gross mor- 
phology, but also implied that recent intercontinental dispersal accounts for 
these species' current distributions (Liston et al. 1989). How this "sweep- 
stakes dispersal" (see Box 8.2) occurred remains a mystery, but one possi- 
bility is that the plants' sticky seeds were transported by migrating or lost 
birds. 
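
For readers unfamiliar with the statistic, Nei's genetic identity (I) is computed from allele frequencies as the normalized sum of cross-products across loci, and Nei's standard distance is D = -ln I. The sketch below assumes the standard Nei (1972) formulation; the allele frequencies are hypothetical and are not the Senecio data.

```python
import math


def nei_identity(pop_x, pop_y):
    """Nei's (1972) genetic identity I between two populations.

    Each population is a list of loci; each locus is a dict mapping an
    allele name to its frequency (frequencies at a locus sum to 1).
    """
    if len(pop_x) != len(pop_y):
        raise ValueError("both populations must be scored at the same loci")
    jx = jy = jxy = 0.0
    for locus_x, locus_y in zip(pop_x, pop_y):
        alleles = set(locus_x) | set(locus_y)
        jx += sum(locus_x.get(a, 0.0) ** 2 for a in alleles)
        jy += sum(locus_y.get(a, 0.0) ** 2 for a in alleles)
        jxy += sum(locus_x.get(a, 0.0) * locus_y.get(a, 0.0) for a in alleles)
    return jxy / math.sqrt(jx * jy)


def nei_distance(identity):
    """Nei's standard genetic distance D = -ln(I)."""
    return -math.log(identity)


# Hypothetical allele frequencies at three allozyme loci.
species_a = [{"F": 1.0}, {"F": 0.8, "S": 0.2}, {"M": 1.0}]
species_b = [{"F": 1.0}, {"F": 0.6, "S": 0.4}, {"M": 1.0}]

identity = nei_identity(species_a, species_b)
print(round(identity, 3), round(nei_distance(identity), 3))
```

Values of I near 1 indicate nearly identical allele frequencies, which is why an estimate of 0.95 between species on different continents points to a recent separation.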

Much deeper along the temporal scale, considerable attention has been 
devoted to possible phylogeographic consequences of the Mesozoic 
breakup of the world's supercontinents and the subsequent movements of 
landmasses by plate tectonics. For example, characiform fishes consist of 
about 1,200 living species in 16 families, nearly all restricted to fresh waters 
of Africa and South America. By assaying slowly evolving rDNA 
sequences, Ortí and Meyer (1997) identified three ancient trans-Atlantic 
clades, each of which apparently was sundered by the 90-mya vicariant
disjunction of these continents. This ancient Gondwanaland breakup has
likewise been inferred to have left deep and concordant molecular
phylogeographic footprints on two other large taxonomic assemblages of
circumtropical freshwater fishes, Cichlidae and Aplocheiloidei (Figure 8.12), as
well as on circumtropical birds in the orders Psittaciformes (parrots) and 
Piciformes (including barbets and toucans) (Miyaki et al. 1998; Sibley and 
Ahlquist 1986, 1990). 

Overall, however, it is probably unwise to dichotomize dispersal and 
vicariance too sharply (except for heuristic purposes) because both process- 
es play key roles in many instances. This point is forcefully illustrated by the 
global phylogeography of ratite birds (ostriches, rheas, emus, kiwis, and 
their allies). Molecular and other evidence have often been interpreted as 
consistent with ancient lineage separations and Cretaceous vicariance 
events (the continental breakup of Gondwanaland) as accounting for the 
current distribution of these flightless birds on southern landmasses
(Cooper 1997; Lee et al. 1997; Prager et al. 1976; Sibley and Ahlquist 1981;
Stapel et al. 1984; see Härlid et al. 1998 for a dissenting view). Complete
mtDNA genomic sequences for extant ratite species recently appear to have 
confirmed the general outlines of this view (Haddrath and Baker 2001). 
However, molecular phylogeographic findings have also demonstrated at 
least two instances of dispersal in ratite evolution: a secondary colonization 
of New Zealand by kiwis (Cooper et al. 1992, 2001), and a colonization of 
Africa by an ostrich ancestor that probably arose in South America about 60 
mya before dispersing to Africa via the Northern Hemisphere (Haddrath 
and Baker 2001; van Tuinen et al. 1998). 

Figure 8.12 Maximum parsimony trees for two speciose groups of circumtropical
freshwater fishes: Aplocheiloidei (Murphy and Collier 1997) and Cichlidae
(Farias et al. 1999; see also Streelman et al. 1998; Zardoya et al. 1996). Trees were
derived from mtDNA sequences. (Modified from Avise 2000a.)


COMMON ANCESTRY VERSUS CONVERGENCE. In molecular PCM analyses, a
typical goal is to distinguish between common ancestry and convergent 
evolution as the source of particular phenotypic features shared between 
extant taxa. These issues often come into sharp focus in a geographic con- 
text. For example, many Australian marsupials bear general morphological 
or behavioral resemblance to particular placental mammals elsewhere in the 
world. Examples of such convergences include kangaroos and deer, the 
Tasmanian wolf and placental carnivores, and bandicoots and rabbits. Such 
cases of "ecological equivalence" are “known” to register evolutionary con- 
vergence because all of the marsupials retain detailed and almost certain 
signatures of common ancestry, most notably a pouch (marsupium). 
Furthermore, the suspected monophyly of marsupials, nearly all of which 
live in the Southern Hemisphere, makes biogeographic sense. Australia 
became isolated from other landmasses during the early and middle 
Tertiary period (ca. 30-60 mya), and the marsupials there adaptively radiat- 
ed to fill many ecological niches occupied by placental mammals elsewhere. 
Many cladistic details of this evolutionary radiation of marsupials have 
been worked out using a panoply of molecular methods, including protein 
immunology (Baverstock et al. 1987; Kirsch 1977), allozyme analyses 
(Sinclair 2001), DNA hybridization (Kirsch et al. 1990; Springer and Kirsch 
1989, 1991; Springer et al. 1990), microsatellite analyses (Pope et al. 2000), 
and direct sequencing of nuclear and mitochondrial genes (Belov et al. 2002; 
Colgan 1999; Janke et al. 2002; Springer et al. 1998). 

One of the most dramatic findings in early molecular phylogenetics 
involved what may have been an analogous evolutionary radiation of 
Australian songbirds. European ornithologists described and named the 
Australian endemics only after having classified much of the Old World avi- 
fauna. Many of Australia's species appeared to fit neatly into taxa previous- 
ly established. For example, Australian warbler-like birds were placed into 
Sylviidae (Eurasian warblers), Australian flycatchers into Muscicapidae
(Afro-Eurasian flycatchers), treecreepers into Certhiidae (Eurasian- 
American creepers), and sitellas into Sittidae (Holarctic nuthatches). More 
than a century later, however, molecular analyses (based initially on DNA 
hybridization) of hundreds of avian species worldwide indicated that these 
traditional assignments were incorrect (Sibley 1991; Sibley and Ahlquist 
1986, 1990). 

Instead, based on the molecular evidence, many Australian songbirds 
seemed to stem from a common ancestor on the continent, leading Sibley 
and colleagues to conclude that oscine songbirds of the world (those with a 
complex syrinx or voice box) constitute two major phylogenetic lines: sub- 
order Passerida, which evolved in Africa, Eurasia, and North America, and 
suborder Corvida, which originated in the Australian region (Figure 8.13). 
These findings were remarkable. They overturned conventional taxonomic 
wisdom for birds, indicated rampant convergence of general phenotype 

Figure 8.13 Phylogeny of the oscine songbirds based on DNA hybridization
data. Two major historical groups were postulated: one (Corvida) tracing to an
Australian root (some of whose members, such as crows and jays, secondarily
radiated elsewhere in the world) and the other (Passerida) tracing to a
non-Australian origin. (After Sibley and Ahlquist 1986.)


between various members of Corvida and Passerida, and suggested that the 
phylogeographic radiation of a major segment of the Australian avifauna 
had roughly paralleled that of mammals native to Australia. The general 
thrust of these phylogenetic suggestions has been supported in subsequent 
analyses of other types of molecular data (e.g., Ericson et al. 2000, 2002; 
Lovette and Bermingham 2000), albeit not without particular points of con- 
tention (e.g., Barker et al. 2002). 

More surprises about the early (e.g., ordinal-level) evolution of avian 
lineages are emerging from extensive DNA sequence analyses. Much of this 
molecular effort was prompted by a radical reinterpretation of the avian fos- 
sil record by Feduccia (1995, 1996), who hypothesized that all modern birds 
might trace back to one or a few fortunate lineages that survived the mass 
extinction at the Cretaceous/Tertiary (K/T) boundary 65 mya. However,
recent molecular dating studies appear not to support this bottleneck sce- 
nario (Cooper and Penny 1997). Instead, they indicate that many avian lin- 
eages now recognized as different taxonomic orders originated in the early 
to mid-Cretaceous and survived to traverse the K/T boundary (Hedges et 
al. 1996; van Tuinen and Hedges 2001; van Tuinen et al. 2000; Waddell et al. 
1999). These findings also support theories that ancient continental 
breakups were probably an important driving force in early avian evolution 
(Ericson et al. 2002; Paton et al. 2002). 

Other molecular discoveries have prompted major revisions of thought 
concerning deep evolutionary branches in the global phylogeny of placen- 
tal mammals (Waddell et al. 1999). In particular, analyses of many mito- 
chondrial and nuclear genes plus protein sequences indicate that an ancient 
and previously unrecognized lineage of placental mammals originated and 
diversified on the African continent (Nikaido et al. 2003; Springer et al. 1997;
Stanhope et al. 1998b; van Dijk et al. 2001). This "Afrotheria" clade, which 
includes such unlikely evolutionary cousins as elephants, hyraxes, aard- 
varks, golden moles, and tiny elephant shrews (named for their long 
snouts), appears to be one of about four deep molecular branches of pla- 
cental mammals. Another branch includes rodents, rabbits, and primates; a 
third includes bats, hoofed animals, and most carnivores; and the fourth 
branch is composed of armadillos, sloths, and their allies (Madsen et al. 
2001; Murphy et al. 2001). Furthermore, based on molecular clock consider- 
ations and earliest fossil appearances, several deep mammalian lineages 
(and probably many others nested within them) are now believed to have 
separated well before the end of the Cretaceous (Hedges 2001; Hedges et al. 
1996; Kumar and Hedges 1998; Springer et al. 2003). These findings chal- 
lenge earlier hypotheses that the primary diversification of mammalian lin- 
eages occurred after the K/T mass extinction, and they also support theories 
that continental configurations have been a key driving force in early mam- 
malian evolution (Delsuc et al. 2002). 


RECENT ISLANDS, ANCIENT INHABITANTS. Another spectacular evolution- 
ary radiation studied via molecular markers is that of Drosophilidae flies in 
the Hawaiian Islands, the native home to an estimated 800 or more species 
(a remarkable count because the whole archipelago accounts for only 0.01% 
of Earth's total land area). These flies traditionally are divided into two 
groups—the drosophiloids and scaptomyzoids—that were postulated to 
derive from one or two founder populations of unknown continental origin 
(Throckmorton 1975). From molecular analyses, it now appears likely that 
all Hawaiian Drosophila and Scaptomyza form one large monophyletic group 
(Kwiatowski and Ayala 1999). Geologically, the most ancient of the major 
islands above water today is only 5 million years old. Was the incredible
proliferation of the drosophilid clade in the Hawaiian archipelago truly 
accomplished within such a short evolutionary time span? 


From immunological comparisons of larval hemolymph proteins,
Beverley and Wilson (1985) provided some of the first evidence that various 
lineage separations in these Hawaiian flies vastly predate the volcanic emer- 
gence of present-day islands, a contention that soon gained further support 
from data on nuclear and mitochondrial gene sequences (DeSalle 1992a,b; 
Thomas and Hunt 1991). Apparently, extant Hawaiian drosophilids stem 
from a colonist that landed on the archipelago about 30 mya (Kambysellis 
and Craddock 1997). This paradox has a resolution: The current land archi- 
pelago contains merely the latest in a succession of islands dating back more 
than 70 million years, remnants of which now exist as drowned seamounts 
or low atolls northwest of the current chain. Thus, according to the molecu- 
lar findings, many speciations probably occurred tens of millions of years 
ago on islands no longer in existence as flies island-hopped to newly arisen 
terrain. Similar speciation processes are continuing today, as evidenced by 
molecular and other data on particular Drosophila subgroups, such as the 
picture-winged flies (Carson 1976, 1992; Piano et al. 1997). On the other 
hand, molecular analyses of about 20 other taxonomic groups on the 
Hawaiian Islands (mostly various native birds and plants) have typically 
yielded estimates of divergence times falling within the past 5 million years 
(see review in Price and Clague 2002). 

Analogous questions apply to biotas on another volcanic island chain 
that has figured prominently in evolutionary studies. All of the present-day 
Galápagos Islands are less than 3 million years old, an age considered by 
some researchers to place an outer bound on the maximum time over which 
the exuberant evolution on the archipelago must have taken place (Hickman 
and Lipps 1985). Another possibility, however, is that many speciation 
events transpired on former islands before they sank beneath the sea 
(Christie et al. 1992). Molecular phylogenetic data have helped to decide 
between these competing possibilities. For example, small genetic distances 
within the 14-species clade of Darwin's finches (Geospizinae) are consistent 
with the hypothesis of an in situ evolutionary radiation within the 3-million- 
year time frame of the modern islands (Grant and Grant 2003; Petren et al. 
1999; Polans 1983; Sato et al. 1999, 2001; Yang and Patton 1981). Similarly, 
molecular findings on the Galápagos tortoises (Geochelone nigra) indicate 
that morphologically diverse populations of this species are monophyletic 
and just a few million years old, having originated from the Chaco tortoise 
(G. chilensis) of mainland South America (Caccone et al. 1999, 2002). 

On the other hand, larger genetic distances suggestive of separation 
dates of greater antiquity were reported between the marine iguana 
(Amblyrhynchus cristatus) and land iguanas (Conolophus pallidus and C. sub- 
cristatus) of the Galápagos (Rassman 1997; Wyles and Sarich 1983). Perhaps 
these genera were phylogenetically separated on now-drowned islands (as 
their relatively ancient yet monophyletic molecular status might suggest), 
but the possibility that they trace to two or more separate invasions of the 
islands cannot be excluded. Similar multi-colonization scenarios have been 
advanced to account for large genetic distances observed among some other 
lizards and rodents native to the Galápagos (Lopez et al. 1992; Patton and
Hafner 1983; Wright 1983). The broader message is that extensive taxonom- 
ic sampling and secure molecular phylogenies are often necessary to draw 
definitive conclusions about the origins and subsequent evolutionary histo- 
ries of island biotas (Emerson 2002). 


Academic pursuit of genealogical roots 


In truth, many studies in molecular phylogeny probably are initiated by 
sheer intellectual curiosity about a particular group’s ancestry. Most sys- 
tematists have a favorite taxon (be it fishes or fungi), the phylogenetic 
understanding of which can become an obsession. Although it would be
presumptuous to choose particular examples in molecular systematics as
being of special inherent interest, the Primate assemblage that includes
humans stands out as worthy of mention, because no other topic in molec- 
ular phylogenetics and evolution has attracted so much attention (Donnelly 
and Tavaré 1997; Lewin 1999; Tashian and Lasker 1996). 

Traditionally, Homo sapiens has been classified as the sole extant species in 
Hominidae, a taxonomic family belonging to the superfamily Hominoidea, 
which also includes the Asiatic apes [gibbons (Hylobates), siamangs 
(Symphalangus), and orangutans (Pongo)] and the African apes [gorillas
(Gorilla) and chimpanzees (Pan)]. Our closest relatives outside the 
Hominoidea clearly are Old World monkeys (Cercopithecoidea). Within 
Hominoidea, conventional wisdom has been that humans’ closest living rela- 
tives are the great apes of Africa (Pongidae). 

Prior to the availability of molecular data, a popular paleontological sce- 
nario was that the lineage leading to humans split from a line leading to goril- 
las and chimpanzees about 15-30 mya (see Patterson 1987). In 1967, in a stun- 
ning report based on immunological assays (Figure 8.14), Sarich and Wilson 
challenged this belief by concluding that the phylogenetic lineage eventuat- 
ing in Homo sapiens separated from that leading to the African apes only 
about 5 mya. Furthermore, the African apes might not form a distinct clade, 
because lineages leading to chimpanzee, gorilla, and human constituted an 
unresolved phylogenetic “trichotomy” in these immunological analyses. 

Nearly 40 years and dozens of molecular studies later, both of these con- 
clusions have been largely vindicated. It is now generally accepted that the 
proto-human lineage separated about 4-6 mya, and that humans, chim- 
panzees, and gorillas are related roughly equidistantly to one another (but 
see below). These conclusions grew from an early consensus of molecular 
information from protein immunology, protein electrophoresis, amino acid 
sequencing, DNA hybridization, restriction site analyses, and nucleotide
sequencing from mtDNA as well as many nuclear genes and noncoding
regions. Some of the pioneering studies were by Bruce and Ayala (1979),
Caccone and Powell (1989), Goodman et al. (1990), Hasegawa (1990), 
Miyamoto and Goodman (1990), Nei and Tajima (1985), Sibley et al. (1990), 
and Williams and Goodman (1989). Prior competing scenarios from mor- 

Figure 8.14 One of the first molecular-based estimates of the phylogenetic
position of Homo sapiens within the primates. This phylogeny, based on
immunological distances in albumins, revolutionized thought about human
origins, and its major features generally have been confirmed with much
additional molecular evidence. (After Sarich and Wilson 1967; see also
Goodman 1962.)


phology and fossil evidence were also reinterpreted, at least partially, to 
accommodate these molecular revelations (Andrews 1987; Pilbeam 1984). 
Recent genome-scale comparisons of human and chimpanzee sequence data 
(nuclear and mitochondrial, respectively) can be found in Chen et al. (2001) 
and Ingman et al. (2000). 


Much attention has been focused on resolving the human-chimpanzee-gorilla
phylogenetic trichotomy. Early discussions centered on the
suitability of various classes of molecular data as well as appropriate methods 
of statistical analysis (e.g., Kishino and Hasegawa 1989; Nei et al. 1985; Saitou 
and Nei 1986; Templeton 1983). A consensus gradually emerged favoring a 
human-chimpanzee clade, to which the gorilla represents a close sister line- 
age (Hasegawa 1990; Horai et al. 1992; Li and Graur 1991; Ruvolo et al. 1994; 
Sibley and Ahlquist 1984; Williams and Goodman 1989). This sentiment 
remains preeminent today, albeit with an added recognition that the temporal 
closeness of nodes in the phylogenetic "trichotomy" means that some (a 
minority) of gene trees might actually ally humans with gorillas rather than 
with chimpanzees (Pääbo 2003). In any event, from the extensive molecular 
analyses of great ape relationships, at least two salient points have emerged: 
humans are phylogenetically very close to our primate relatives, and the tra- 
ditional placement of Homo sapiens in a monotypic family (Hominidae) 
reflects anthropocentric bias more than objective phylogenetic reality. 


Some Special Topics in Phylogeny Estimation


Another way to organize thought about the plethora of phylogenetic studies
is to focus on the diversity of molecular methodologies employed. This
section will highlight several additional pioneering approaches that have
had significant effects on phylogenetic analyses of particular taxa, or at
particular temporal depths in the evolutionary hierarchy.


DNA hybridization and avian systematics 


One of the first mega-analyses in molecular phylogenetics involved a 
decade-long series of DNA hybridization studies by Charles Sibley and Jon 
Ahlquist on more than 1,700 avian species representing nearly all of the 171 
taxonomic families of birds. The collective result was what became known 
in ornithological circles as "The Tapestry," an extended estimate of avian 
phylogeny that in its printed version spanned 42 pages in the authors’ sum- 
mary tome (Sibley and Ahlquist 1990). This work was revolutionary not 
only for its unprecedented taxonomic sampling, but also for its many 
provocative conclusions. One of these was the proposed evolutionary radi- 
ation of the Australian Corvida, described above. Another concerned New 
World and Old World vultures, which Sibley and Ahlquist (1990) claimed 
were not closely related. This result was generally confirmed (albeit with 
differences in detail) in subsequent direct analyses of cytochrome b mtDNA 
sequences (Seibold and Helbig 1995; Wink 1995), thus implying that vul- 
tures and their carrion-feeding lifestyle are polyphyletic in origin. 

Several other unorthodox conclusions from The Tapestry are described 
in Table 8.2. Although Sibley and Ahlquist's approach and conclusions were 
by no means met with universal approbation (e.g., Cracraft 1992; Cracraft 
and Mindell 1989; Sarich et al. 1989), their findings continue to warrant seri- 
ous consideration for several reasons: the DNA hybridization approach is 
presumably powerful, at least in theory; most branching patterns in The 
Tapestry do agree well with traditional ornithological thought based on 
morphological or other evidence; and some (but not all) of even the most 
controversial aspects of The Tapestry have gained additional support from 
subsequent direct sequencing analyses of mitochondrial or nuclear genes 
(see Newton 2003; Sorenson et al. 2003 for additional examples). 

Regardless of how these various phylogenetic hypotheses are eventual- 
ly resolved, an important point regarding this immense effort by Sibley and 
Ahlquist should not be lost: Their study was among the first sweeping
attempts to capitalize explicitly on a “common yardstick” rationale (see 
Chapter 1) in molecular phylogenetics. These researchers promulgated a 
standardized molecular metric (ΔT50H, presumably related to evolutionary
time) that could serve as a potentially universal measure of the magnitude of
evolutionary separation among all avian (as well as other) taxa, be they
hummingbirds or ratites. Indeed, Sibley and Ahlquist went further by
advocating that these metrics be directly translated into a universal taxonomy
based on “categorical equivalency.” For example, they suggested that species
should be classified at the family level when they exhibit a ΔT50H in the range
of 9-11°C, and at the sub-ordinal level when ΔT50H = 18-20°C (Table 8.3).
Whether or not this translation, or perhaps others based directly on
evolutionary time (see the section “Toward a Global Phylogeny ...”, p. 460),
eventually is adopted, the comparative molecular perspectives promoted by
Sibley and Ahlquist will remain an important historical landmark along any
path that eventually may lead to a universally standardized systematics.


TABLE 8.2 Some unorthodox and provocative conclusions about avian phylogeny from
DNA hybridization studies by Sibley and Ahlquist (1990)

Provocative conclusion from DNA hybridization       Follow-up molecular support

Loons (Gaviiformes) and grebes                      Hedges and Sibley 1994
(Podicipediformes) are not close
evolutionary relatives.

Game birds (Galliformes) and waterfowl              Mindell et al. 1997; van Tuinen
(Anseriformes) are one another’s closest            and Hedges 2001
living relatives, albeit anciently separated
(ca. 90 mya).

New World barbets (in Capitonidae) are              Lanyon and Hall 1994
evolutionarily closer to toucans (Ramphastidae)
than to Old World barbets.

The traditional order Pelecaniformes is not         Hedges and Sibley 1994;
monophyletic, but instead includes at least         Siegel-Causey 1997
3-4 highly distinct clades, often with closest
living relatives elsewhere.

The American wrentit (Chamaea fasciata) is          Shirihai et al. 2001
related to Old World Sylvia warblers.

New World quails (Odontophoridae) are a             Kornegay et al. 1993
sister clade to the pheasants rather than
being embedded within the Phasianidae.

Note: Each example listed here has gained some support from subsequent
molecular analyses, often based on direct DNA sequencing. However, not all of
Sibley and Ahlquist's conclusions have been upheld by subsequent DNA sequence
data. For example, some of their proposed relationships among species of
Gruiformes now appear to be questionable (Houde et al. 1997), as do particular
details concerning relationships of some of the passerine species in the
Australian corvid radiation (see text).


Mitochondrial DNA and the higher systematics of animals 


In addition to its great utility as a microevolutionary marker at the intraspe- 
cific level (see Chapter 6), mtDNA also is employed widely as an informa- 
tive guide to phylogenetic relationships among higher animal taxa. For 
these latter purposes, sequence regions that evolve more slowly than aver- 
age or unique structural alterations in the molecule are monitored. 

TABLE 8.3 Suggested levels of genetic divergence (as measured by DNA hybridization) to
be associated with indicated levels of taxonomic recognition in birds, as per
recommendations of Sibley et al. (1988)

Taxonomic category   Suffix      ΔT50H range (°C)   Example

Class                --          31-33              Aves
Subclass             -ornithes   29-31              Neornithes
Infraclass           -aves       27-29              Neoaves
Parvclass            -ae         24.5-27            Galloanserae
Superorder           -morphae    22-24.5            Anserimorphae
Order                -iformes    20-22              Anseriformes
Suborder             -i          18-20              --
Infraorder           -ides       15.5-18            Anserides
Parvorder            -ida        13-15.5            --
Superfamily          -oidea      11-13              --
Family               -idae       9-11               Anatidae
Subfamily            -inae       7-9                Anatinae
Tribe                -ini        4.5-7              Anatini
Subtribe             -ina        2.2-4.5            --
Congeneric spp.      --          0-2.2              Anas (puddle ducks)
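
The "categorical equivalency" idea amounts to a lookup from a DNA hybridization distance (ΔT50H in Sibley and Ahlquist's notation) to a taxonomic rank. The Python sketch below encodes the ranges of Table 8.3 for illustration only. Two choices are assumptions of the sketch rather than part of the published scheme: the superorder range is read as 22-24.5, and because adjacent ranges share endpoints, each interval is treated as including its lower bound and excluding its upper bound.

```python
# (lower bound, upper bound, category), in degrees C of delta T50H.
# Ranges follow Table 8.3; half-open intervals are an assumption made here
# so that shared endpoints such as 11 fall into exactly one category.
CATEGORIES = [
    (0.0, 2.2, "Congeneric spp."),
    (2.2, 4.5, "Subtribe"),
    (4.5, 7.0, "Tribe"),
    (7.0, 9.0, "Subfamily"),
    (9.0, 11.0, "Family"),
    (11.0, 13.0, "Superfamily"),
    (13.0, 15.5, "Parvorder"),
    (15.5, 18.0, "Infraorder"),
    (18.0, 20.0, "Suborder"),
    (20.0, 22.0, "Order"),
    (22.0, 24.5, "Superorder"),
    (24.5, 27.0, "Parvclass"),
    (27.0, 29.0, "Infraclass"),
    (29.0, 31.0, "Subclass"),
    (31.0, 33.0, "Class"),
]


def taxonomic_category(delta_t50h):
    """Return the rank suggested for a pair of taxa separated by delta_t50h."""
    for low, high, name in CATEGORIES:
        if low <= delta_t50h < high:
            return name
    raise ValueError(f"delta T50H of {delta_t50h} lies outside the tabulated ranges")


for value in (1.5, 10.0, 19.0):
    print(value, taxonomic_category(value))
```

The appeal of such a scheme is that the same numerical yardstick applies to any pair of taxa; its weakness is that it is only as reliable as the assumption that ΔT50H tracks evolutionary time evenly across lineages.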


EXTENSIVE MTDNA SEQUENCES. One common approach in higher-animal 
phylogenetics involves direct sequence comparisons of relatively slowly 
evolving mtDNA loci, or portions thereof, such as transversions or non-syn- 
onymous substitutions in protein-coding regions. Several examples already 
have been presented above (e.g., regarding bats, crabs, whales, coelacanths, 
and blackbirds; see also Table 8.1), and more will follow. Especially popular 
have been mitochondrial genes encoding rRNA subunits (e.g., Colgan et al. 
2000; Sullivan et al. 1995), cytochrome oxidases (Hebert et al. 2003; Sena et al. 
2002), and cytochrome b (Lydeard and Roe 1997). However, none of these or 
other mtDNA sequences are without shortcomings for phylogeny estimation 
(Meyer 1994b; Naylor and Brown 1998), and each has a restricted temporal 
window of resolution (Moore and DeFilippis 1997). 

As sequencing procedures have become streamlined and automated, a 
growing approach is to conduct "brute force" analyses of whole (ca. 16 kb 
each) mtDNA genomes (e.g., Kumazawa and Nishida 1999; Mindell et al. 
1999). In a remarkable example of this approach, Miya et al. (2003) gathered 
and phylogenetically analyzed complete mtDNA sequences from 100 
species representing 74 taxonomic families in 26 orders of teleost fishes. For 
any species, such analyses exhaust the information content of the mtDNA 
genome, which nonetheless represents only a single “gene” from a phylo- 
genetic viewpoint, and therefore only a tiny fraction of the hereditary histo- 
ry in any organismal phylogeny. 


ECCENTRIC MTDNA MARKERS. The DNA hybridization studies discussed 
earlier, by providing a numerical “average” genetic distance over a large but 
unspecified number of low-copy genes, represent one end of a continuum of
empirical and philosophical approaches in molecular systematics. Near the 
other end of this spectrum are analyses that focus on much smaller numbers 
of molecular features that nonetheless could be of special cladistic relevance 
by virtue of supposed singularities of evolutionary origin. 

One such potential suite of characters involves gene arrangements and 
other unusual compositional properties of mtDNA. The same ensemble of 
about 37 genes constitutes the mtDNA of most animal species, but gene 
orders vary somewhat (Hoffmann et al. 1992; Hyman et al. 1988; Lee and 
Kocher 1995; Macey et al. 1997; Quinn and Mindell 1996; Sankoff et al. 1992; 
Wolstenholme et al. 1985). For example, although identical mtDNA gene 
arrangements are displayed by most placental mammals, amphibians, and 
fishes, the molecule’s tRNA clusters are reordered relative to the main ver- 
tebrate theme in several surveyed marsupials (Janke et al. 1994; Pääbo et al. 
1991). Likewise, gene order in quail was shown to differ from the vertebrate 
norm, in this case by a transposition of five loci (Desjardins and Morais
1991). In another example, alternative mtDNA gene orders distinguish two 
primary groups of songbirds (oscines and suboscines) that also are well 
demarcated by morphological and other molecular evidence (Mindell et al. 
1998). A review of mitochondrial genome organization in vertebrates is pro- 
vided by Pereira (2000). 

Distinctive mtDNA gene arrangements have also been employed as 
phylogenetic markers in fungi (Bruns and Palmer 1989; Bruns et al. 1989) 
and in a wide variety of invertebrate animal taxa (Arndt and Smith 1998b; 
Boore and Brown 2000; von Nickisch-Rosenegk et al. 2001). Particular gene 
orders have suggested, for example, that the following groups of 
Arthropoda are monophyletic: the phylum as a whole; Mandibulata 
(insects, myriapods, and crustaceans); and a subclade composed of insects 
plus crustaceans (Boore et al. 1995, 1998). Echinodermata is another invertebrate
phylum for which mtDNA gene arrangements have added important
clues to the cladogram. This assemblage is traditionally divided into five 
classes: Asteroidea (sea stars), Echinoidea (sea urchins), Ophiuroidea (brit- 
tle stars), Holothuroidea (sea cucumbers), and Crinoidea (crinoids). Fossil 
evidence suggests that these taxa split from one another early in the 
Paleozoic (albeit in controversial order; A. B. Smith 1992). In molecular 
studies reviewed by M. J. Smith et al. (1993), mtDNA gene arrangements in 
ophiuroids and asteroids proved to be similar, but contrasted sharply with the 
layout common to echinoids and holothuroids (crinoids were not assayed). 
Outgroup rooting (against vertebrates) suggested that the asteroid-ophi- 
uroid condition is synapomorphic, thus defining a clade. 

Because mtDNA gene rearrangements are evolutionarily rather rare, 
particular gene orders might therefore seem to provide ideal clade markers, 
but such is not invariably the case. In their survey of mtDNA genomes in 
137 species representing 13 avian taxonomic orders, Mindell et al. (1998) dis- 
covered that one novel gene arrangement had originated independently in 
woodpeckers (Picidae), cuckoos (Cuculidae), songbirds (Passeriformes), 
and birds of prey (Falconiformes). The authors concluded that although 
mtDNA gene arrangements offer excellent power for clade delineation, it is 
also true that "gene order characters appear susceptible to some parallel 
evolution because of mechanistic constraints." Several of these kinds of 
mechanisms and constraints have been illuminated (Lavrov et al. 2002; 
Macey et al. 1998). 
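
One simple way to quantify how far two gene orders differ is to count breakpoints, that is, gene adjacencies present in one genome but absent from the other. The sketch below is illustrative only: the function names and the six-gene orders are hypothetical, gene orientation (strand) is ignored, and it is not the method of any study cited above.

```python
def adjacencies(order, circular=True):
    """Return the set of unordered neighboring gene pairs in a gene order."""
    n = len(order)
    last = n if circular else n - 1
    pairs = set()
    for i in range(last):
        # frozenset makes the pair unordered: (A, B) equals (B, A)
        pairs.add(frozenset((order[i], order[(i + 1) % n])))
    return pairs


def breakpoint_distance(order_a, order_b, circular=True):
    """Count adjacencies of order_a that are not found in order_b."""
    if set(order_a) != set(order_b):
        raise ValueError("both genomes must contain the same genes")
    return len(adjacencies(order_a, circular) - adjacencies(order_b, circular))


# Hypothetical six-gene circular genomes (labels are placeholders, not real loci).
reference = ["A", "B", "C", "D", "E", "F"]
transposed = ["A", "D", "B", "C", "E", "F"]  # gene D moved to a new position
inverted = ["A", "C", "B", "D", "E", "F"]    # block B-C reversed

print(breakpoint_distance(reference, transposed))  # 3
print(breakpoint_distance(reference, inverted))    # 2
```

Under these assumptions a single transposition of one gene disrupts three adjacencies and an inversion of a two-gene block disrupts two, so small counts are what rare rearrangements would be expected to leave behind. A shared breakpoint pattern is still only a candidate synapomorphy because, as noted above, the same arrangement can arise more than once.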

For a few invertebrates, another phylogenetically useful feature of the 
mtDNA genome is its linear (not circular) condition. Within the phylum 
Cnidaria (corals, anemones, jellyfishes, and allies), ascertainment of rela- 
tionships among taxonomic groups has posed a classic phylogenetic chal- 
lenge due to a paucity of phenotypic characters that are independent of dra- 
matic differences in the life cycles of various forms. A surprising molecular 
discovery was that all surveyed members of the classes Cubozoa, 
Scyphozoa, and Hydrozoa possess linear mtDNA, as opposed to the circu- 
lar mtDNA of Anthozoa, Ctenophora (a supposed outgroup), and most 
other surveyed metazoan phyla (Bridge et al. 1992). Thus, the derived (lin- 
ear) condition appears to define a clade. This finding (through its incorpo- 
ration into PCM exercises) implied that a benthic polyp stage, rather than a 
pelagic medusa form, probably came first in cnidarian evolution. The 
mtDNA genomes of various metazoan groups also show other types of evo- 
lutionary novelties, such as differences in tRNA secondary structures, sizes 
of rRNA genes, peculiar features of the control region, and presence of spe- 
cific introns. These features also can provide phylogenetic signals (Beagley 
et al. 1998; Wolstenholme 1992). 

One especially intriguing set of mtDNA features involves codon assign- 
ments in protein translation (reviewed in Knight et al. 2001a,b). Following 
the initial discovery of modified codon assignments in mammalian mtDNA 
(Barrell et al. 1979), it was found that genetic codes in mitochondria not only 
depart from the "universal" code of nuclear genomes, but also vary across 
taxa. Wolstenholme (1992) summarized available data on mtDNA codon 
assignments in 19 metazoan animals representing six phyla and superim- 
posed the observed differences on a suspected phylogeny (Figure 8.15). 
Several of the patterns made phylogenetic sense. For example, in the 
mtDNAs of all assayed invertebrate phyla (except Cnidaria), AGA and AGG 
specify serine, whereas they are stop codons in vertebrate mtDNA. Thus, 
based on this apparent shared derived condition, vertebrates form a clade 
(see also Boore et al. 1999). On the other hand, evidence for homoplasy in 
codon assignments was also uncovered. The best example involved the 
nucleotide triplet AAA, which in two supposedly unrelated phyla 
(Echinodermata and Platyhelminthes) may have convergently evolved to 
specify asparagine, not lysine as in most other metazoan mtDNAs and in 
the universal code (Figure 8.15). Also, in an ancestor of echinoderms, an 
apparent reversion may have occurred to the presumed ancestral condition 
in which ATA specifies isoleucine rather than methionine. Finally, the AAA 
codon was later discovered to be missing altogether in hemichordates 
(Castresana et al. 1998). 

Figure 8.15 One evolutionary scenario for alterations of the genetic code in 
metazoan mtDNA. Observed departures from the “universal genetic code” are 
plotted on a provisional phylogeny for metazoan animals based on other evidence. 
Solid bars across branches indicate probable synapomorphies marking various 
clades; open bars indicate probable instances of evolutionary convergence or rever- 
sal. (After Wolstenholme 1992.) 
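
The codon reassignments discussed in the text can be written down as a small lookup table. The following is a minimal sketch, not a full translation table: only the four codons mentioned above are modeled, the lineage labels ("vertebrate", "invertebrate", "echinoderm") and function names are hypothetical, and "invertebrate" stands for the common invertebrate mitochondrial code rather than for every invertebrate phylum.

```python
# Assignments in the "universal" nuclear code for the four codons discussed.
UNIVERSAL = {"AGA": "Arg", "AGG": "Arg", "AAA": "Lys", "ATA": "Ile"}

# Departures from the universal code in three mitochondrial codes.
MT_OVERRIDES = {
    "vertebrate": {"AGA": "Stop", "AGG": "Stop", "ATA": "Met"},
    "invertebrate": {"AGA": "Ser", "AGG": "Ser", "ATA": "Met"},
    # Echinoderms: AAA reads as Asn, and ATA reads as Ile as in the universal code.
    "echinoderm": {"AGA": "Ser", "AGG": "Ser", "AAA": "Asn"},
}


def assignment(codon, lineage=None):
    """Amino acid (or Stop) specified by a codon; lineage=None means universal."""
    table = dict(UNIVERSAL)
    table.update(MT_OVERRIDES.get(lineage, {}))
    return table[codon]


def differing_codons(lineage_a, lineage_b):
    """Codons (among those modeled) that are read differently in two lineages."""
    return sorted(
        codon for codon in UNIVERSAL
        if assignment(codon, lineage_a) != assignment(codon, lineage_b)
    )


print(differing_codons("vertebrate", "invertebrate"))  # ['AGA', 'AGG']
print(differing_codons("invertebrate", "echinoderm"))  # ['AAA', 'ATA']
```

Listing the differences pairwise in this way shows which codons carry potential phylogenetic signal between two groups, but it cannot by itself distinguish a shared derived state from a convergent or reverted one; that inference requires mapping the states onto an independent phylogeny, as in Figure 8.15.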


Although studies of mtDNA genome arrangements have clearly been 
phylogenetically informative, they also have amplified a cautionary note 
regarding all attempts to define clades by idiosyncratic genetic markers con- 
sidered individually. No matter how secure a synapomorphy might appear, 
the possibility of homoplasy seldom can be eliminated entirely. Thus, before 
definitive phylogenetic conclusions are drawn, confirmation from multiple 
independent sources of information should always be attempted. 


Chloroplast DNA and the higher systematics of plants 


What the mtDNA molecule has done for animal higher systematics, cpDNA 
has analogously done for plants. The chloroplast genome is well suited for 
plant phylogenetic analyses for several reasons: it is abundant in plant cells 
and essentially ubiquitous taxonomically; much background information is 
available to facilitate experimental and comparative work; it often houses 
distinctive structural features of cladistic utility; and its general pace of 
nucleotide substitution is moderate to slow (Clegg and Zurawski 1992). 
Again, researchers use two distinct phylogenetic approaches (Olmstead et 
al. 1990): monitoring of taxonomically idiosyncratic molecular features of 
cpDNA, the rationale being that these features should be especially 
powerful for clade identification; and sequencing of specific cpDNA genes or 
regions. Sequences cumulatively provide vastly larger pools of genetic 
markers, but on the other hand, character states at individual nucleotide 
positions can be afflicted with higher probabilities of homoplasy. 


EXTENSIVE CPDNA SEQUENCES. In 1993, a landmark paper with 42 authors 
offered a compendium appraisal of seed plant phylogeny based on a huge 
database of nucleotide sequences from the rbcL gene (Chase et al. 1993), 
which encodes the large polypeptide subunit of ribulose-1,5-bisphosphate 
carboxylase. This chloroplast gene became a favored sequencing target for 
several reasons: much comparative information had accumulated for it 
early on (Baum 1994; Duvall et al. 1993; Gaut et al. 1992; Kim et al. 1992; 
Soltis et al. 1990); the locus is large (>1,400 bp) and provides many 
phylogenetically informative characters; and its rate of sequence evolution proved 
appropriate for addressing questions about plant phylogeny, especially at 
intermediate and higher taxonomic levels. In subsequent years, sequences 
from additional species and several other genes, notably another plastid 
locus (atpB), a nuclear gene encoding 18S rRNA, and some slowly evolving 
mitochondrial genes, were added to the mix (Parkinson et al. 1999; Qiu et al. 
1999; Soltis et al. 1998, 1999). 

Summary gene sequence phylogenies for hundreds of seed plant 
species are shown in Figure 8.16. Many of the clades had been identified in 
earlier studies, but seldom with the high levels of statistical support that the 
combined genetic data provided. The following are among many notable 
conclusions reached for angiosperms (flowering plants) (Figure 8.16A): 
eudicots form a large monophyletic group, as do angiosperms at a higher 
clade level; magnoliids, which include monocots, form a distinct clade that 
branched off early in angiosperm evolution; plants that engage in nitrogen- 
fixing symbioses with nodulating bacteria form a subclade within the 
eurosid I clade; and many "model species" that are used widely in genetic 
and evolutionary research, such as Arabidopsis (mustards), Brassica (cab- 
bages and allies), and Gossypium (cotton), lie within the rosid clade and, 
hence, encompass only a tiny fraction of total phylogenetic diversity in seed 
plants. Several molecular conclusions disagreed with conventional wisdom, 
as evidenced by the fact that some DNA-based clades within the core eudi- 
cots cut across subclass boundaries in previous classifications (Soltis et al. 
1999). The molecular findings thus have given impetus to a revision of 
angiosperm ordinal classifications (Nyffeler 1999). 

Gymnosperms (naked-seeded plants) have been phylogenetically ana- 
lyzed in similar fashion (Bowe et al. 2000; Chaw et al. 2000). Some primary 
conclusions are as follows (Figure 8.16B): extant gymnosperms are mono- 
phyletic; cycads are the most basal clade of gymnosperms, followed by 
Ginkgo; and members of the long-problematic order Gnetales are allied to 
conifers. Also based on molecular evidence, two other major groups of vas- 
cular plants—ferns and club mosses (or lycophytes)—probably branched off 
prior to the "seed plant" clade (gymnosperms plus angiosperms). Apart 
from these analyses of vascular plants, molecular data have also contributed 
to knowledge of the deeper phylogeny of land plants, which also include 
mosses, hornworts, and liverworts. Emerging results from this vast 
phylogenetic effort are summarized and periodically updated on the Deep Green 
Web site (ucjeps.berkeley.edu/TreeofLife/hyperbolic.php). 

Figure 8.16 Consensus phylogenies for representative seed plants based on 
DNA sequence data from multiple cytoplasmic and nuclear loci. (A) Molecular 
phylogeny primarily for angiosperms. Numbers in parentheses are tallies of 
species examined (560 total, but not all are pictured). (After Soltis et al. 1999.) (B) 
Molecular phylogeny emphasizing relationships among gymnosperms. (After 
Chaw et al. 2000.) In both diagrams, numbers on branches indicate levels of 
statistical support for various clades. 

One advantage of sequence analyses is that general estimates of diver- 
gence times (as well as branching patterns) can be attempted, using appeals 
to molecular clock considerations (Olmstead et al. 1992; Wolfe et al. 1989). In 
taking advantage of this utility, Wikström et al. (2001) used molecular 
findings in conjunction with fossil-based evidence to estimate geologic times for 
more than 500 nodes in the angiosperm tree. For example, they estimated 
that the crown group of extant angiosperms arose in the Early to Middle 
Jurassic (about 179-158 mya), and that the eudicot lineage originated in the 
Late Jurassic to mid-Cretaceous (147-131 mya). 
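
The arithmetic behind the simplest molecular clock estimate can be sketched as follows. Under a strict clock, two lineages that split t million years ago each accumulate substitutions at rate r, so their expected pairwise distance is d = 2rt and hence t = d / (2r). The function name and the numerical values below are hypothetical, chosen only for illustration; the studies cited above calibrated rates against fossils and did not necessarily assume a single constant rate.

```python
def divergence_time(distance, rate_per_lineage):
    """Estimate time since divergence under a strict molecular clock.

    distance: substitutions per site separating two sequences
    rate_per_lineage: substitutions per site per million years on one lineage
    Returns the estimated divergence time in millions of years.
    """
    if rate_per_lineage <= 0:
        raise ValueError("rate must be positive")
    # Both lineages accumulate changes, hence the factor of 2.
    return distance / (2.0 * rate_per_lineage)


# Hypothetical inputs: 10% sequence divergence, 0.05% change per lineage per Myr.
estimate = divergence_time(distance=0.10, rate_per_lineage=0.0005)
print(round(estimate, 1))  # about 100 million years
```

Because the estimate scales inversely with the assumed rate, any error in the calibration propagates directly to the date, which is one reason such estimates are best treated as general rather than precise.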

It has not gone unnoticed (Rieseberg and Brunsfeld 1992) that 
molecular conclusions in plant higher systematics have been based to a large 
degree on the same molecule (cpDNA) for which reticulate evolution due 
to introgression has been documented between several closely related 
plant species (see Chapter 7). Could this phenomenon also have led to dis- 
crepancies between cpDNA gene trees and organismal trees in deep plant 
phylogeny? In theory, yes, but probably seldom in practice, according to 
Clegg and Zurawski (1992): "It is reasonable to assume that the approxi- 
mation to organismal history will improve as time increases, because the 
biases introduced by interspecific hybridization or intraspecific polymor- 
phism will diminish with an increase in time scale." Nevertheless, plant 
phylogenies estimated from cpDNA sequences (or any other single-gene 
genealogies) always remain provisional pending corroboration from addi- 
tional loci. 


ECCENTRIC CPDNA MARKERS. Phylogeny estimation in plants has also ben- 
efited from relatively unusual or idiosyncratic features of cpDNA. Such 
phylogenetic markers include inversions, losses of genes and introns, and 
losses of a large inverted repeat region of the molecule (Downie and Palmer 
1992). 

With some notable exceptions, such as those involving Pisum (Palmer et 
al. 1988b), Trifolium (Milligan et al. 1989), and conifers (Strauss et al. 1988), 
gene order is normally a conservative feature of cpDNA in vascular plants. 
This is illustrated by the fact that the arrangement of cpDNA genes in tobac- 
co (Nicotiana tabacum) is similar to the presumed ancestral order also found 
in most other examined angiosperms, ferns, and Ginkgo (Palmer et al. 
1988a). Furthermore, when gene order differences have been discovered, 
one or a few inversions usually appear to be responsible. For example, 
cpDNAs from bryophytes, mosses, and lycopsids differ from those of most 
vascular plants by a 30-kb inversion (Calie and Hughes 1987; Ohyama et al. 
1986; Raubeson and Jansen 1992a), one of the few large structural alterations 
in cpDNA accepted over the hundreds of millions of years of evolution 
involved. This inversion appears to be a synapomorphy for vascular plants 
minus the lycopsids, and it provisionally identifies a basal evolutionary split 
among vascular plants. 

One of the first and most comprehensive phylogenetic studies of a 
cpDNA rearrangement involved a 22-kb inversion found to be shared by 57 
genera representing all tribes of Asteraceae (sunflowers), a taxonomic fami- 
ly with more than 20,000 species in 1,100 genera (Jansen and Palmer 1987). 
Absence of this inversion from the subtribe Barnadesiinae of the Mutisieae 
tribe, and from all families allied to Asteraceae, suggested that 
Barnadesiinae represents the most basal lineage of Asteraceae and that, con- 
trary to earlier opinion, Mutisieae is not monophyletic. These conclusions 
were supported by congruent results obtained from phylogenetic analyses 
of cpDNA restriction sites (Jansen and Palmer 1988) and sequences (Kim et 
al. 1992; see review in Jansen et al. 1992). 

Unlike the typical case for mtDNA in most higher animals, cpDNA 
genes often carry numerous introns (Shinozaki et al. 1986). Losses of entire 
introns (in contrast to oft-observed changes in intron length) seem to be rel- 
atively rare events, and hence are phylogenetically informative when they 
are observed. For example, the absence of an intron in the rpl2 gene marked 
all examined members of Caryophyllales (Downie et al. 1991). Caution in 
the use of such markers was also indicated, however, because this particu- 
lar intron apparently was lost independently in at least five unrelated dicot 
lineages (Downie et al. 1991). Apart from introns, evolutionary losses of par- 
ticular cpDNA genes (perhaps often to the nucleus; see below) also have 
been employed as phylogenetic signals (Downie and Palmer 1992). 

Most land plant cpDNAs possess a large (20-30-kb) inverted repeat (IR) 
region. A shared deletion of one copy of this repeat indicated probable 
monophyly for six tribes in the subfamily Papilionoideae (Lavin et al. 1990). 
Although this character state ("IR loss") is phylogenetically informative 
locally, it again could be a misleading guide to global phylogeny because IR 
losses seem to have occurred independently in more than one plant group 
(Doyle et al. 1992). Conifers, for example, also possess only one IR element 
(Lidholm et al. 1988; Raubeson and Jansen 1992b). 
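
Whether a rare feature such as an intron or IR loss is consistent with a single origin can be checked by counting the minimum number of state changes it requires on a tree estimated from other evidence. The sketch below applies Fitch parsimony to one binary character; the tree, taxon labels, and character states are hypothetical and do not represent any of the plant groups discussed above.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one character on a rooted binary tree.

    tree: nested 2-tuples with taxon names (strings) at the tips
    states: dict mapping each taxon name to its character state
    """
    def visit(node):
        if isinstance(node, str):          # tip: its observed state, no cost
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:                         # children agree: no change needed
            return shared, left_cost + right_cost
        # children disagree: one change is required on this part of the tree
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical five-taxon tree: (((A, B), C), (D, E)).
tree = ((("A", "B"), "C"), ("D", "E"))

# 1 = feature lost, 0 = feature retained.
single_origin = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
scattered = {"A": 1, "B": 0, "C": 0, "D": 1, "E": 0}

print(fitch_changes(tree, single_origin))  # 1
print(fitch_changes(tree, scattered))      # 2
```

A count of one is what a clean synapomorphy looks like; a count above one on a well-supported tree signals homoplasy, which is the situation described for the independent losses noted in the text.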

In general, convergent evolutionary gains of rare features are even less 
likely than convergent losses (although both types of events can be troubled 
by homoplasy). Hence, the shared possession of de novo genomic additions 
should be of special significance in clade delineation. The phylogenetic ori- 
gin of embryophytes (land plants) has long intrigued botanists. One candi- 
date sister group was Charophyceae (a particular group of green algae), a 
possibility consistent with early molecular findings on cpDNA. All 
previously examined algae, as well as eubacteria, lacked introns in their tRNA(Ala) 
and tRNA(Ile) genes, whereas all assayed embryophytes, as well as 
charophyceans, possessed them. This observation was interpreted to indicate the 
evolutionary acquisition of a genetic novelty, perhaps marking a clade of 
land plants plus Charophyceae (Manhart and Palmer 1990). However, a later 
discovery was that mosses, hornworts, and all major vascular plant lineages 
share three mtDNA introns that are not possessed by green algae or other 
eukaryotes. This discovery was taken as refined evidence that even better 
candidates for the closest ancestors of land plants exist (Qiu et al. 1998). In 
general, at the time of this writing, the whole topic of early plant (and 
eukaryote) origins is under intense discussion and possible revision (e.g., 
Cavalier-Smith 2003; Cavalier-Smith and Chao 2003; Nozaki et al. 2003). 


Ribosomal gene sequences and deep phylogenies 


Phylogenetic analyses of slowly evolving rDNA sequences have also been 
hugely important in resolving deep and intermediate branches in the Tree of 
Life. The main function of rRNAs is protein synthesis, so it is not surprising 
that genomes of all organisms contain sequences coding for these essential 
molecules (Wheelis et al. 1992). Starting mostly in the late 1980s, many 
researchers began to take advantage of the phylogenetic markers provided 
by these genes at meso- and macroevolutionary levels (e.g., Edman et al. 
1988; Fuhrman et al. 1992; Gerbi 1985; Hamby and Zimmer 1992; Hedges et 
al. 1990; Hillis and Dixon 1989, 1991; Hori et al. 1985; Jorgensen and Cluster 
1988; Kumazaki et al. 1983; Mindell and Honeycutt 1990; Nairn and Ferl 
1988; Sogin et al. 1989). The number of such studies has grown explosively 
since then, a fact evidenced, for example, by the journal Molecular 
Phylogenetics and Evolution, in which 346 of 965 articles (36%) published dur- 
ing its first 11 years (1992-2002) included phylogenetic analyses of rRNA 
genes. 

A few examples of this approach, as applied to early animal lineages, 
are listed in Table 8.4, and Figure 8.17 illustrates the impressive amount of 
historical information that can be recovered from even a single ribosomal 
gene (in this case, the sequence encoding the 18S rRNA subunit in diverse 
metazoan lineages). On the other hand, it is also understood that sequence 
data from individual genes (ribosomal or otherwise) can be misrepresenta- 
tive or even "positively misleading" (Felsenstein 1978b) regarding ancient 
organismal relationships. One well-known danger is "long-branch attrac- 
tion" (Hendy and Penny 1989), in which unrelated lineages falsely appear 
to constitute ancient clades because of backward, parallel, and convergent 
nucleotide substitutions that tend to accumulate over long evolutionary 
time and increase the ratio of phylogenetic noise to signal. Later sections of 
this chapter will suggest further reasons to retain a cautious view of phylo- 
genies estimated from nucleotide sequences of just one or a few genes. 


Genomic Mergers, DNA Transfers, and Life's Early History 


From Greek antiquity to recent times, a common notion was that living 
organisms could be divided into two kingdoms: animals and plants. 
Following invention of the microscope and the discovery of microbes, a dif- 
ferent view came into vogue, which subdivided all of life into prokaryotes 
(microorganisms lacking a membrane-bound nucleus) and eukaryotes 
(organisms consisting of cells with true nuclei). Later, in the 1960s, a pro- 
posal by Whittaker (1959) became popular, in which five primary kingdoms 
were recognized: prokaryotes, unicellular eukaryotes, fungi, plants, and 
animals. 

A breakthrough occurred in 1977 when Carl Woese and colleagues 
analyzed 16S rRNA gene sequences and concluded that all living systems 
should be divided in a different fashion, along what appeared to be distinct 
phylogenetic lines of descent (Fox et al. 1977, 1980; Woese 1987; Woese and 
Fox 1977). Their scenario (Figure 8.18) proposed that all forms of life could 
be classified into three "domains" above the rank of kingdom: Eucarya 
(eukaryotic organisms, or at least the nuclear component of their cells); 
Bacteria (previously Eubacteria); and Archaea (formerly Archaebacteria), 
including methanogens and thermophilic forms. This view was generally 
supported by sequence analyses of additional rRNA genes (Gouy and Li 
1989) and other loci (see below). The molecular differences separating extant 
members of these lineages appeared to be "of a more profound nature than 
the differences that separate typical kingdoms, such as animals and plants" 
(Woese et al. 1990). 

Figure 8.17 Estimate of metazoan phylogeny based on 18S rDNA sequences. 
Numbers on branches indicate statistical support for putative clades. This diagram 
is intended merely to illustrate rDNA approaches and should not be interpreted as 
definitive regarding metazoan relationships. (From Winnepenninckx et al. 1998.) 

Figure 8.18 One of the first extensive reconstructions of deep phylogenetic 
topology in the Tree of Life, inferred by Woese and colleagues from sequences of 
small-subunit rRNA genes. Codings indicate phylogenetic distributions of three 
physiological traits as indicated by their occurrence within at least some members 
of the taxonomic groups pictured. 

Table 8.4 
Group                     Reference                          Authors' conclusions 
Invertebrates             Field et al. 1988                  Cnidarians are separate from other animal lineages 
Deuterostomes             Cameron et al. 2000                Echinoderms + hemichordates form a clade, as do chordates 
Deuterostomes, chordates  Winchell et al. 2002               Echinoderms + hemichordates form a clade; lancelets sister to chordates 
Molting animals           Aguinaldo et al. 1997              Arthropods, nematodes, and other "molters" form a clade 
Protostomes               Mallatt and Winchell 2002          Supported conclusions of molting study above; identified subclades 
Bilaterians               Peterson and Eernisse 2001         Annelids group with mollusks and other taxa with spiral cleavage 
Ctenophora                Podar et al. 2001                  Comb jellies are monophyletic, related most closely to cnidarians 
Metazoans                 Bromham et al. 1998; Giribet 2002  Early metazoan phylogeny evaluated against supposed Cambrian explosion 
Basal groups              Medina et al. 2001                 Bilateria + Cnidaria + Ctenophora + Metazoa form a clade 
Reptiles                  Hedges and Poling 1999             Crocodilians and turtles form a clade; squamates at base of reptile tree 
Jawless fishes            Mallatt and Sullivan 1998          Cyclostomes (lampreys and hagfishes) are monophyletic 
Bony fishes               Obermiller and Pfeiler 2003        Fishes with leptocephalus larvae are not demonstrably monophyletic 
Bony fishes               Chen et al. 2002                   Main lineages of higher teleosts identified 
Cartilaginous fishes      Douady et al. 2003                 Sharks form a clade; rays and skates are basal to elasmobranch lineage 

One immediate question of interest concerned where the root might 
belong on the Eucarya-Bacteria-Archaea tree. This issue was addressed not 
only using rRNA genes (e.g., Hori and Osawa 1987), but also from a 
cladistic standpoint by examining distributions of shared derived features 
postdating the separations of two primordially duplicated genes: those 
encoding elongation factors and ATPase subunits (Gogarten et al. 1989; Iwabe et 
al. 1989). All of these studies concluded that Archaea and Eucarya constitute 
sister lineages within a larger clade, such that the root of life lies somewhere 
within the Bacteria. This view is still favored today, but not without major 
qualifications relating to ancient intergroup mergers and lateral DNA 
transfers (Ribeiro and Golding 1998; see following sections). 

Although some competing interpretations of Woese’s data and the 
deep-branching structure of life were voiced early (see Day 1991; Lake 1991), there 
could no longer be any doubt that major phylogenetic lineages, previously 
unrecognized, exist among prokaryotic life. With regard to taxonomy, some 
authors nonetheless contended that the differences in levels of biological 
organization between prokaryotes and eukaryotes remain so profound that 
continued recognition of these two traditional assemblages is desirable (Mayr 
1990, 1998). Other researchers retorted that a failure to formalize Archaea and 
Bacteria as higher taxa would perpetuate an artificial and flawed classification 
that disregards phylogeny (Woese 1998a; Woese et al. 1991). This debate 
raises a general question: Should classifications strictly reflect branching 
structures in evolutionary trees, or should perceived grades of organismal 
resemblance somehow be incorporated as well? This is a subjective (albeit an 
operationally important) question whose answer depends on one's view of the 
nature of information that a formal classification should convey. 

Regardless of their taxonomic implications, the original studies by 
Woese and colleagues stimulated molecular examinations of many more taxa 
and genes in efforts to clarify life's deep phylogeny (for updated reviews, see 
Olsen and Woese 1997; Wolf et al. 2002). So too did a proposal by Cavalier- 
Smith (1983) to formally recognize an assemblage (Archezoa) of single-celled 
eukaryotes hypothesized to be so old that they might have predated the 
endosymbiotic origins of mitochondria and chloroplasts (see next section). 
Subsequent molecular research soon intimated that several eukaryotic pro- 
tists, such as diplomonads, entamoebas, euglenids, slime molds, and tri- 
chomonads, do indeed encompass incredible phylogenetic diversity (Knoll 
1992; Sogin 1991), with many lineages probably tracing back independently 
across billion-year time frames (Knoll 1999). Recent molecular findings also 
indicate, however, that at least some of these "Archezoa" lost their organelle 
genomes secondarily, after the endosymbiotic mergers (Roger 1999). In any 
event, various lineages of eukaryotic protists are certainly extremely ancient, 
as illustrated by a recently published consensus phylogeny for eukaryotes 
(Figure 8.19). To appreciate the relative divergence scales involved, note in 
particular the placement of "animals" (nested inside the opisthokonts) with- 
in this broader genealogical framework. 


From ancient endosymbioses to recent intergenomic transfers 


ORIGINS OF EUKARYOTIC CELLS. Traditionally, two hypotheses were 
advanced to account for the distinctive nuclear and cytoplasmic genomes 
(mtDNA and cpDNA) found in modern eukaryotic cells. One hypothesis 
(discussed in Cavalier-Smith 1975; Uzzell and Spolsky 1981) stipulated that 
organelle genomes arose autogenously within eukaryotes as fragments from 
the nuclear genome were incorporated into membrane-encased mitochon- 
dria or chloroplasts. The competing hypothesis stipulated that organelle 
genomes had exogenous origins, stemming from bacteria that invaded (or 
were engulfed by) proto-eukaryotic host cells bearing precursors of the 
nuclear genome (Margulis 1981, 1995). Beginning mostly in the 1980s, this 
"endosymbiont theory" received considerable molecular support from phy- 
logenetic analyses of several genes and gene families (Bremer and Bremer 
1989; Gray et al. 1989; Howe et al. 1992; Kishino et al. 1990; Lockhart et al. 
1992; W. Martin et al. 1992; Schwartz and Dayhoff 1978; Van den Eynde et 
al. 1988; Van de Peer et al. 1990; Villanueva et al. 1985). 

The type of molecular evidence supporting the endosymbiont theory is 
illustrated by cytoplasmic small-subunit rRNA genes, which show much clos- 
er phylogenetic affinities to rDNA sequences in Bacteria than they do to 
nuclear rDNA sequences in their own eukaryotic host cells (see Figure 8.18). 
Specifically, it was shown that representative mitochondrial rDNAs 
phylogenetically group with those of the α-subdivision of purple proteobacteria 
(Cedergren et al. 1988; Yang et al. 1985), and that chloroplast rDNAs group 
with those of photosynthetic cyanobacteria (Giovannoni et al. 1988). Likewise, 
analyses of several other genes (e.g., Cammarano et al. 1992; Morden et al. 
1992; Pühler et al. 1989; Recipon et al. 1992) supported an alliance of mtDNA 
and cpDNA with Bacteria rather than Eucarya, and also confirmed the 
overall distinctness of Archaea (as first proposed by Woese and colleagues). 

Figure 8.19 Deep phylogeny for eukaryotes. This diagram represents a (highly 
provisional) consensus picture from extensive molecular data (e.g., Baldauf et al. 
2000; Hirt et al. 1999; Katz and Sogin 1999; Keeling et al. 2000) and ultrastructural 
morphology. (After Baldauf 2003.) 

Alternative viewpoints persist about the finer details of the original or 
primary endosymbiotic events. For example, some authors argue for a poly- 
phyletic origin of chloroplasts (Stiller et al. 2003), whereas most other 
researchers interpret available molecular and other evidence as indicative of 
ultimate monophyly for all plant plastids, albeit probably with some sec- 
ondary symbiotic transfers and many genomic rearrangements at later 
times (Palmer 2003; Turmel et al. 2002). There seems to be less doubt that 
mitochondria arose singularly near the base of eukaryotic evolution 
(Palmer 2003). 

In any event, further data from more genes and protein sequences have 
evidenced an even more complicated and interesting phylogenetic history 
than is suggested in Figure 8.18. For example, eukaryotic nuclear genes 
encoding proteins involved in transcription and translation often cluster 
phylogenetically with Archaea, whereas those for metabolic proteins often 
tend to cluster with Bacteria (Rivera et al. 1998). Furthermore, the 
metabolic proteins of eukaryotes often group with those of α-proteobacteria, and 
those of plants often group with cyanobacteria. These results were again 
interpreted to reflect endosymbiotic origins of mitochondria and plastids 
from these respective microbial assemblages, but with the mergers followed 
by functional transfers of various symbiont genes to host nuclei (Golding 
and Gupta 1995; Gupta 1998; Lang et al. 1999; Margulis 1996). 


GENETIC COMMERCE BETWEEN CELL ORGANELLES AND NUCLEI. Post- 
endosymbiotic transfers of cytoplasmic DNA to the nucleus had long been 
suspected from the observation that modern mitochondrial and chloroplast 
genomes house only a small subset of the genes required for their own 
replication and expression, with complementary functions being encoded 
by nuclear DNA (Gray 1992; Wallace 1986). Focused studies of specific 
DNA sequences (e.g., Kubo et al. 1999), as well as computer database 
searches (e.g., Blanchard and Schmidt 1995), have confirmed earlier sug- 
gestions that physical transfers (recent as well as ancient) of genetic mate- 
rial between organelle and nuclear genomes have been relatively common 
during evolution (Baldauf and Palmer 1990; Baldauf et al. 1990; Fox 1983; 
Gantt et al. 1991; Gellissen et al. 1983; Nugent and Palmer 1991; Stern and 
Palmer 1984). These events are sometimes referred to as intracellular gene 
transfers (IGT). 

In animals, most inter-genomic DNA movements that eventuated in 
actual transfers of function probably occurred soon after the endosymbiotic 
mergers (as evidenced, for example, by the fact that nearly all of the dozens 
of fully sequenced mtDNA genomes of diverse metazoan animals contain 
exactly the same set of 13 protein-coding genes). Recent migrations of ani- 
mal mtDNA sequences to the nucleus have proved to be pervasive as well 
(Collura and Stewart 1995; Mourier et al. 2001; M. F. Smith et al. 1992; 
Sorenson and Fleischer 1996; Williams and Knowlton 2001; Woischnik and 
Moraes 2002), but most such sequences trafficked to the nucleus are now 
functionless pseudogenes (Bensasson et al. 2001). Unless care is taken (e.g., 
by examining for stop codons), researchers sometimes can misinterpret 
these nucleus-housed pseudogenes as bona fide cytoplasmic mtDNA (e.g., 
when using PCR-based assays with mtDNA primers; Collura et al. 1996; 
Zhang and Hewitt 1996). If the true origin of a transferred sequence goes 
unrecognized, this can create interpretive errors in phylogenetic assess- 
ments (analogous to traditional problems of confusing paralogy with 
orthology in analyses of multi-gene families), and it can even lead to diag- 
nostic mistakes in medical or other applications for "mtDNA" markers 
(Wallace et al. 1997). 
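
The stop-codon check mentioned above can be sketched in a few lines of Python. This is only an illustrative screen, not a published protocol: the function names and the two example fragments are hypothetical, and the set of stop codons assumes the vertebrate mitochondrial genetic code (other lineages use other codes). A premature in-frame stop codon suggests, but does not prove, that a supposedly mitochondrial sequence is a nuclear pseudogene.

```python
# Stop codons under the vertebrate mitochondrial genetic code (an assumption;
# invertebrate and other mitochondrial codes differ).
VERT_MT_STOPS = {"TAA", "TAG", "AGA", "AGG"}


def internal_stop_codons(seq, frame=0, stops=VERT_MT_STOPS):
    """Return 0-based positions of in-frame stop codons, ignoring the last codon."""
    seq = seq.upper().replace("U", "T")
    codon_starts = range(frame, len(seq) - 2, 3)
    hits = [i for i in codon_starts if seq[i:i + 3] in stops]
    # A stop in the final complete codon is expected for an intact gene.
    last_start = frame + 3 * ((len(seq) - frame) // 3 - 1)
    return [i for i in hits if i != last_start]


def looks_like_pseudogene(seq, frame=0):
    """Flag a putative mtDNA fragment that carries a premature stop codon."""
    return len(internal_stop_codons(seq, frame)) > 0


# Hypothetical fragments, invented purely for illustration.
intact = "ATGGCACACCCAACACAACTAGGATTCCAAGACTAA"
broken = "ATGGCACACTAAACACAACTAGGATTCCAAGACTAA"  # fourth codon changed to TAA

print(internal_stop_codons(intact))   # []
print(internal_stop_codons(broken))   # [9]
print(looks_like_pseudogene(broken))  # True
```

A sequence that passes this screen is not thereby shown to be cytoplasmic; recently transferred copies may not yet have accumulated disabling mutations.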

In plants, many taxa possess a nearly identical set of about 40 different 
mtDNA genes, presumably retained from algal ancestors more than a billion 
years ago. Nonetheless, some angiosperms have repeatedly and recently 
lost mitochondrial genes to the nucleus (Adams et al. 2000, 2001, 2002), and 
some of these ongoing IGTs are functionally effective. For example, Adams 
et al. (1999) documented multiple shifts of mitochondrial cox2 sequences to 
nuclei, followed by inactivation of the gene in one genome or the other. 
Chloroplast DNA sequences are also heavily involved in this type of inter- 
genomic commerce (Huang et al. 2003; Millen et al. 2001; Thorsness and 
Weber 1996). 

The ancient lineage mergers discussed above, and undoubtedly others 
early in the history of life (Delwiche 1999; Hartman and Fedorov 2002; 
Margulis and Sagan 2002), plus subsequent inter-genomic gene traffic 
(Syvanen 2002; see also next section), mean that eukaryotic cells and their 
constituent genomes are phylogenetic “mosaics” (Ribeiro and Golding 1998) 
or "chimeras" (Katz 1999) housing mixtures of distinct evolutionary lines. 
Thus, “phylogeny” in the primordial biological world was probably anastomotic
or network-like (Figure 8.20), rather than mostly branched and hierarchical,
as it was later to become (Woese 1998b, 2002). Such possibilities 
raise important issues about phylogeny estimation, and even about the 
meaning of deep organismal phylogeny. On the other hand, some "core" 
genes that were never exchanged may exist, and these genes may yet permit 
reconstruction of mainstream relationships among ancient life forms 
(Doolittle et al. 2003). 

Figure 8.20 Schematic representation of genomic mergers and lateral transfer 
events of the kind that probably characterized early life. Included in the diagram 
are the acquisitions by Eucarya of mitochondrial and chloroplast genes from
proteobacteria and cyanobacteria, respectively. (After Doolittle 1999.) 

Ancient genomic mergers represent biological reality, so the mosaic 
nature of modern genomes is not merely an artifact of phylogenetic "sam- 
pling noise." Furthermore, the phylogenetic information in mosaic 
genomes can be turned to advantage in historical reconstructions. In one 
such example, Hedges et al. (2001) synthesized and integrated available 
molecular data regarding eukaryotic origins with evidence about physical 
conditions on primordial Earth (Figure 8.21). Their results suggested an 
early divergence (ca. 4.0 bya) between Archaea and the archaeal genes 
now present in eukaryotic cells; subsequent occurrences of at least two 
genetic mergers (at 2.7 and 1.8 bya) that transferred genes from Bacteria 
into eukaryotes; an early phylogenetic separation for Giardia, an unusual 
eukaryote with only tiny remnant mitochondria (Roger et al. 1998; Tovar 
et al. 2003); and the appearance of cyanobacteria (presumably the origina- 
tors of oxygen-generating photosynthesis) immediately prior to the earli- 
est undisputed evidence for the rise of oxygen in the Earth's primitive 
atmosphere (ca. 2.5 bya). These intriguing suggestions will merit further 
critical evaluation. 

Figure 8.21 Postulated relationships between key phylogenetic events deep in 
the history of life and their temporal relationship to environmental conditions of 
primordial Earth and its atmosphere. (After Hedges et al. 2001.) 


Horizontal gene transfer 


Genetic transmission is overwhelmingly "vertical"; that is, genes are transmit- 
ted from parents to offspring. If this were not true, heredity would have little 
meaning, and phylogenetic trees built from independent characters would sel- 
dom exhibit the coherent structures that tend to characterize most well-stud- 
ied taxonomic groups. As described above, however, remarkable instances of 
"horizontal gene transfer" (HGT) have occurred in evolution—for example, 
during the probable endosymbiotic mergers that led to eukaryotic cells. 

Apart from these ancient endosymbioses, have HGT events shuttled 
particular nucleotide sequences between otherwise isolated species? The 
unequivocal short answer to this question is "yes." Of current scientific 
interest are the frequencies, mechanisms, and evolutionary consequences 
of such lateral gene movements (Gogarten 2003; Gogarten et al. 2002; 
Syvanen and Kado 2002). HGT must be distinguished from interspecies 
gene movement via introgressive hybridization (see Chapter 7), which is 
merely a special case of vertical heredity. For the purposes of the current 
discussion, HGTs between taxa must also be distinguished from IGTs 
between organelle and nuclear genomes strictly within a genetic lineage (as 
described above). 


MOLECULAR CRITERIA FOR INFERRING HGT. Provisional molecular support 
for HGT usually comes from one or another of the patterns described in 
Table 8.5. However, all of these patterns represent indirect or surrogate lines 
of evidence, rather than definitive documentations of HGT phenomena. 
This often leaves ample room for alternative interpretations, and several 
authors have justifiably criticized particular empirical claims for HGT 
(Eisen 2000; Koski et al. 2001; Ragan 2001). 

For example, one oft-used class of evidence for HGT between taxa is a 
gross discrepancy between the apparent phylogeny for a given segment of 
DNA and an overwhelming consensus phylogeny for those taxa based on 
other data (Gogarten 1995; M. W. Smith et al. 1992). In such cases, an HGT 
event is postulated to have produced the aberrant phylogeny for the "odd- 
man-out" sequence. However, several evolutionary processes other than 
HGT can also lead to apparent incongruence between the phylogeny of a 
particular gene sequence and the broader phylogeny of the genome (Figure 
8.22): shared retention of ancestral states by the taxa in question; pro- 
nounced heterogeneity in molecular evolutionary rates across lineages; con- 
vergent evolution; introgressive hybridization; mistaken assumptions of 
orthology for loci that actually are paralogous; and idiosyncratic gene loss 
in separate lineages. Thus, extreme caution is indicated in deducing HGT 
events from "discordant" phylogenetic signatures alone. 
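
The comparison of a gene tree against a consensus tree can be made concrete with a small Python sketch. Here trees are written as nested tuples of taxon labels, and the two trees are compared by listing the clades that appear in one but not the other (a rooted version of the Robinson-Foulds count). The representation, the function names, and the taxon labels A to E are assumptions made for illustration only. A nonzero count merely flags discordance; it cannot by itself distinguish HGT from the other processes listed above.

```python
def clades(tree):
    """Return the clades (frozensets of leaf names) below the root of a tree.

    A tree is a nested tuple of leaf-name strings, e.g. (("A", "B"), "C").
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    found.discard(all_leaves)  # the root clade is shared by every tree
    return found


def conflicting_clades(gene_tree, species_tree):
    """Clades present in exactly one of the two trees."""
    return clades(gene_tree) ^ clades(species_tree)


# Hypothetical five-taxon example.
species_tree = ((("A", "B"), "C"), ("D", "E"))
same_topology = (("E", "D"), ("C", ("B", "A")))   # same tree, drawn differently
odd_gene_tree = ((("A", "D"), "C"), ("B", "E"))   # A groups with D, not B

print(len(conflicting_clades(same_topology, species_tree)))  # 0
print(len(conflicting_clades(odd_gene_tree, species_tree)))  # 6
```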

Indeed, some of the earliest reports of HGT were soundly criticized (by 
Lawrence and Hartl 1992; Leunissen and de Jong 1986; Shatters and Kahn 
1989; Steffens et al. 1983) for failure to eliminate competing possibilities. 

TABLE 8.5 Four molecular criteria by which putative instances of horizontal gene transfer 
often are inferred 

Criterion                  Description 

1. Incongruent trees       Phylogenetic trees for specific DNA sequences or proteins show 
                           discordant relationships with known or suspected phylogenetic
                           trees for the taxa in which they are housed. Caution is 
                           indicated, however, because incongruent trees can also result 
                           from several other evolutionary forces (see Figure 8.22). 
                           Nonetheless, satisfaction of this criterion often provides the 
                           strongest available line of evidence for HGT events. 

2. Unusual nucleotide      Shifts in nucleotide composition between neighboring 
   composition             sequences (such as an increase in frequency of GC over AT 
                           base pairs) could indicate that foreign genes or coding 
                           sequences exist as "islands" in the genome. Natural selection 
                           or mutation bias in the absence of HGT could mimic this 
                           outcome, however. Also, the signal of a bona fide ancient 
                           HGT event will normally attenuate over time, making older 
                           HGT events difficult to detect by this criterion. 

3. Unusual species         In a particular species, the presence of a gene that otherwise is 
   distributions of genes  found in distant relatives but not in close relatives could
                           signal an HGT event from the distant taxon. Alternative
                           explanations should be eliminated, such as the possibility of
                           gene loss in the intervening lineages. Also, this criterion
                           clearly cannot work for genes that are taxonomically universal. 

4. Unexpected homology     “BLAST” searches (Altschul et al. 1997) of computer databases 
   patterns                might indicate that a given gene sequence in the species of 
                           interest shows significant similarity (putative homology) to a 
                           gene or genes otherwise known only from distant taxonomic 
                           groups. This method provides a “quick and dirty” screen for 
                           possible HGT events, but for many reasons is error-prone. 

Source: After Brown 2003, and Eisen 2000. 


Two studies that were challenged involved suspected movements of a 
superoxide dismutase gene from a fish into its bacterial symbionts 
(Bannister and Parker 1985; Martin and Fridovich 1981) and of a glutamine 
synthetase gene from a plant into its bacterial symbionts (Carlson and 
Chelm 1986). However, many other early reports of HGT were less easily 
dismissed (e.g., Bork and Doolittle 1992; Doolittle et al. 1990; Heinemann 
1991; Heinemann and Sprague 1989; Kidwell 1992; Mazodier and Davies 
1991; Smith and Doolittle 1992; M. W. Smith et al. 1992; Sprague 1991; 
Zambryski et al. 1989), and compelling evidence has gradually accumulat- 
ed for such events in a wide diversity of eukaryotic as well as prokaryotic 
taxa. 

Figure 8.22 Seven competing hypotheses to account for an apparent character-state 
discordance that otherwise might be attributable to a horizontal gene transfer event. 


PROKARYOTE-PROKARYOTE HGT. Horizontal genetic exchanges between 
prokaryotic species can occur via several routes, including plasmid 
exchange (even between distantly related taxa), transduction (viral-mediat- 
ed transfers), and transformation (uptake and incorporation of DNA from 
the environment). Such processes have long been known, but their full 
impact remains a matter of contention. In one key study, Lawrence and 
Ochman (1998) reported that Escherichia coli has experienced no fewer than 
230 HGT events during the last 70 million years, and that 755 of its 4,288 
protein-coding genes (17.6%) are of foreign origin (presumably introduced 
mostly by plasmids, phages, and transposable elements). The authors con- 
sidered these to be minimum estimates because evidence for older inser- 
tions probably blurs with time, and because many sequence insertions may 
be evolutionarily transient. Of course, all molecular evidence for these 
events was necessarily indirect, involving various criteria (described in 
Table 8.5) that are potentially subject to challenge. 
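
One of the indirect criteria in Table 8.5, unusual nucleotide composition, lends itself to a simple illustration. The Python sketch below slides a window along a sequence and flags windows whose GC fraction departs from the whole-sequence value. It is a toy version of the idea, not the procedure used in any study cited here; the function names, window sizes, threshold, and the artificial test sequence are all assumptions. As the table cautions, selection or mutation bias can produce the same pattern without any HGT.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return sum(base in "GC" for base in counted) / len(counted)


def atypical_windows(genome, window=1000, step=500, threshold=0.10):
    """Return the background GC fraction and a list of (start, gc) windows
    whose GC fraction differs from the background by more than threshold."""
    background = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - background) > threshold:
            flagged.append((start, gc))
    return background, flagged


# Artificial sequence: an AT-rich background with a GC-rich "island".
toy_genome = "AT" * 50 + "GC" * 10 + "AT" * 50
background, islands = atypical_windows(toy_genome, window=20, step=20,
                                       threshold=0.5)
print(islands)  # [(100, 1.0)]
```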

Similar molecular analyses soon suggested that HGT events had been 
fairly common in evolution, moving pieces of DNA among a wide variety 
of prokaryotic species (Figure 8.23), including some apparent transfers 
between Bacteria and Archaea (e.g., Forterre et al. 2000; Nelson et al. 1999). 
For example, Rest and Mindell (2003) inferred at least four lateral gene 
transfers of retroids (elements bearing reverse transcriptases) from Bacteria 
to Archaea (the latter were previously not known to possess such elements). 
The evidence presented for putative HGT events is sometimes quite com- 
pelling, and some authors now view lateral gene movement in prokaryotes 
to be so prevalent as to require fundamental reorientations of thought about 
the nature of bacterial taxa (Lawrence 2002) and the basis of innovative bac- 
terial evolution (Jain et al. 2002, 2003; Koonin et al. 2001; Ochman et al. 
2000). Others reserve judgment pending more definitive molecular evidence 
for ubiquitous HGT in prokaryotic evolution (Eisen 2000; Ragan 2001). Still 
other researchers, while acknowledging that HGT events may be fairly com- 
mon in bacterial evolution, suggest that extensive DNA sequence compar- 
isons of many genes nonetheless will permit robust phylogenetic recon- 
structions of a single primary tree topology for microbial lineages (e.g., 
Daubin et al. 2003; Lerat et al. 2003). 


PROKARYOTE-EUKARYOTE HGT. Lateral gene transfers from prokaryotes 
to eukaryotes are also well known (Klotz and Loewen 2003). One of the 
first-discovered examples involved Agrobacterium tumefaciens, the bacterial 
agent of “crown gall” disease in plants. This bacterium infects wounded 
sites on a tree, resulting in tumor-like growth. During the process, it also 
transfers some of its genetic material (T-DNA, carried on its plasmids) into 
the plant’s nuclear genome. This type of HGT is so effective that purposefully
engineered strains of A. tumefaciens are now employed routinely by 
biotechnologists as transformation vectors for introducing specific transgenes
into commercially valuable plants (see Avise 2004c). Another interesting
example of prokaryote-to-eukaryote HGT involves glycosyl hydrolase
genes that apparently jumped from bacteria to fungi in the rumens of 
cattle, thereby providing the fungi with a useful capacity to degrade the 
cellulose and other plant polysaccharides in that environment (Garcia-Vallvé
et al. 2000). 

Figure 8.23 Horizontally acquired DNA in 18 sequenced bacterial genomes. The 
light gray portion of each horizontal bar indicates the fraction of the DNA 
sequence thought to be native to that species. The darker gray portion indicates the 
suspected fraction of the DNA sequence (also shown numerically as a percentage) 
that is thought to be of foreign origin. (After Ochman et al. 2000.) 

It is a remarkable irony that if HGT events between prokaryotes and 
microbial eukaryotes have indeed been common in evolution, they might 
actually call into question HGT's original poster-child scenario: the 
endosymbiosis theory. To some authors (e.g., Andersson et al. 2003), HGT is 
so pervasive, and "fusion" theories so implausible mechanistically (Rotte 
and Martin 2001), as to demand a rigorous reexamination and a possible 
reinterpretation of the evolutionary origins of eukaryotic cells. Perhaps, 
they suggest, these nucleated cells arose from "routine" HGT events 
between microbes, rather than from wholesale genomic amalgamations all 
at once. 

The frequency of natural HGTs from prokaryotes to metazoan animals, 
including humans, is also under current debate. One provocative claim 
emerged from an initial analysis of the draft human genomic sequence 
(Lander et al. 2001), in which it was suggested that at least 223 genes from 
prokaryotes had been imported into the human nuclear genome. The evi- 
dence consisted primarily of unexpected homology patterns (criterion 4 in 
Table 8.5); also, many of those genes seemed not to be the result of transfers 
from the human mitochondrial genome. However, subsequent analyses of 
expanded data sets and reinterpretations of the molecular evidence chal- 
lenged these HGT claims on several grounds (Genereux and Logsdon 2003; 
Salzberg et al. 2001; Stanhope et al. 2001)—for example, by documenting 
that many sequences purportedly imported from bacteria actually have 
closer phylogenetic ties to other eukaryotic genes. 


EUKARYOTE-EUKARYOTE HGT. One of the earliest convincing cases of HGT 
between eukaryotic species involved "P elements" in fruit flies (Daniels et al. 
1990). These transposable elements have a patchy phylogenetic occurrence 
confined mostly to Drosophila and related dipteran genera (Perkins and 
Howells 1992). A remarkable molecular discovery was that P-element 
sequences in D. melanogaster are nearly identical to those in D. willistoni, 
despite a suspected evolutionary separation of these host species of tens of 
millions of years. Furthermore, close relatives of melanogaster appear to lack P 
elements entirely, whereas these elements are widespread in species of the 
willistoni group. The compelling conclusion was that an HGT event must have 
moved proliferative P elements from the willistoni complex into melanogaster, 
probably within the last century (Kidwell 1992). A semiparasitic mite 
(Proctolaelaps regalis) may have been the mediating vector (Houck et al. 1991). 

Other pioneering studies recognized HGT events among as well as 
within various animal, plant, fungal, and microbial taxa (Calvi et al. 1991; 
Flavell 1992; Mizrokhi and Mazo 1990; Simmons 1992). Such findings were 
merely the tip of the iceberg, however, as many such lateral genetic transfers 
by mobile elements are now well documented (Jordan et al. 1999; Kapitonov 
and Jurka 2003; Kidwell 1993; McDonald 1998; Rosewich and Kistler 2000). 
For example, Cho et al. (1998; see also Cho and Palmer 1999) estimated that 
mobile self-splicing introns had invaded cox1 genes by cross-species HGT 
on at least 1,000 independent occasions during angiosperm evolution. The 
extent and genomic consequences of such lateral movements of nucleic 
acids may be profound, as emphasized, for example, by Brosius (1999) who 
concluded that "genomes were forged by massive bombardments with 
retroelements and retrosequences" (mobile sequences that insert into 
genomes via reverse transcription). 

Furthermore, once a mobile element invades a lineage, it often replicates 
dramatically therein. Incredibly, at least 50%, and probably much more, of 
the human genome (like the genomes of most other eukaryotes) consists of 
remnants of retrotransposable elements and retroviruses that invaded the 
host genome, proliferated, and whose copies are now in various stages of 
expansion or decay (Promislow et al. 1999). To cite just one example, molecular
phylogenetic analyses indicate that one family of retroviral sequences 
invaded the proto-human genome anciently, stayed quiescent for eons, and 
then underwent a burst of transpositional activity approximately 6 million 
years ago, coinciding in time with the separation of proto-human and proto- 
chimpanzee lineages (Jordan and McDonald 2002). 

Most HGT events between eukaryotes probably involve mobile genetic 
elements, but several recently documented instances of HGT between unre- 
lated plants seem harder to explain by this route. These cases involve mito- 
chondrial housekeeping genes (encoding ribosomal and respiratory pro- 
teins) that appear to have transferred during evolution between unrelated 
species, yielding outcomes that include gene duplications, recaptures of 
functional genes formerly lost by a lineage, and appearances of chimeric loci 
whose sequences are essentially half-monocot and half-dicot (Bergthorsson 
et al. 2003). Perhaps vectoring agents such as viruses, bacteria, fungi, insects, 
or pollen were involved, or perhaps there has been transformational uptake 
of plant DNA from the soil. Whatever the explanation, the authors suggest 
that such HGT events between higher plants may be reasonably frequent on 
an evolutionary time scale of millions of years. With regard to higher ani- 
mals, another potential route for HGT—via food ingestion—has been the 
subject of discussion and some limited experimentation (Doolittle 1998; 
Schubbert et al. 1997). 

For any HGT event between eukaryotes to be "successful," the foreign 
DNA must somehow enter germ line cells and then be passed to successive 
generations. Although such horizontal DNA transfer is far too infrequent to 
overturn conventional genetic wisdom about the overwhelming predomi- 
nance of vertical transmission in eukaryotic evolution, it nonetheless is 
proving to be at least an occasional contributor to the taxonomic and evolu- 
tionary distributions of particular DNA sequences in multicellular organ- 
isms (Bushman 2002; Syvanen and Kado 2002). 


Relationships between retroviruses and transposable elements 


Many suspected instances of HGT involve transposable elements (TEs; see 
Box 1.3), classes of DNA sequences that seem to be predisposed to such 
inter-taxon movements by virtue of their inherent proclivity to shift from 
one chromosomal site to another (albeit typically within cell lineages). The 
mechanisms by which TEs occasionally escape the confines of a host lineage 
to colonize other taxa are poorly understood, but several possible routes 
exist, such as by hitchhiking on viruses or parasites. Once inside host cells, 
some TEs (class I retrotransposable elements) can transpose proliferatively 
by reverse transcription of RNA intermediates, whereas others (class II ele- 
ments) merely jump from spot to spot by DNA-to-DNA transposition mech- 
anisms (Finnegan 1989). Most TEs have characteristic structures that include 
gene sequences coding for enzymes involved in the transposition process, 
usually flanked by terminal repeat sequences of varying lengths. In the 
retrotransposable elements (RTEs), one of these genes encodes reverse
transcriptase (RT), which catalyzes the reverse transcription of RNA into DNA. 

Particularly intriguing are the biological and structural similarities 
between RTEs and retroviruses (RVs). Retroviruses are small, single-strand- 
ed RNA viruses that resemble RTEs in several ways, including the produc- 
tion of a reverse transcriptase and the presence of long terminal repeats 
(LTRs) flanking the coding region. However, like other viruses and unlike 
RTEs, retroviruses can encase themselves in a protective envelope that facil- 
itates independent infectious transport across the cells of the same or differ- 
ent organisms. These observations raised an interesting evolutionary ques- 
tion (Doolittle et al. 1989; Finnegan 1983): Might RTEs represent degenerate 
retroviruses that secondarily lost much of their facility for autonomous 
intercellular transport? Or, alternatively, did retroviruses evolve from ances- 
tral RTEs by secondary acquisition of these capabilities? 

In addition to its presence in both RTEs and RVs, a gene for reverse tran- 
scriptase is found in several other genetic elements, including the hepad- 
naviruses of animals and the caulimoviruses of plants. The RT gene also 
exhibits structural similarities (suggestive of shared ancestry) to the RNA- 
directed RNA polymerases of some other viruses. Xiong and Eickbush 
(1990) took advantage of available nucleotide sequences from reverse tran- 
scriptase genes and RNA polymerase genes to estimate a molecular phy- 
logeny for more than 80 RT-containing genetic elements (Figure 8.24). Their 
retroelement phylogeny consisted of two primary branches: one leading to 
non-LTR retrotransposons, the other leading to LTR retrotransposons, retro- 
viruses, caulimoviruses, and hepadnaviruses. Most members within each of 
these five named assemblages grouped together in terms of RT phylogeny. 
The authors concluded that retroviruses probably evolved from retrotrans- 
posable elements rather than vice versa (as evidenced by RVs’ restricted 
position in the broader phylogeny of RT-housing elements). 

However, other researchers have questioned whether conventional phy- 
logenetic concepts really apply to different assemblages of viruses and other 
mobile genetic elements (e.g., Bushman 2002). Their contention is that differ- 
ent groups may actually be independent in origin, or contain idiosyncratic 
amalgamations of genes that themselves have been subject to convergent 
functional evolution as well as possible sequence exchanges among different 
ancestors. Thus, the presumptive phylogeny in Figure 8.24 is somewhat con- 
troversial with regard to overall retroelement history. Nonetheless, it does 
serve to illustrate how molecular appraisals are stimulating novel ideas about 
phylogenetic relationships among even some of the simplest "life forms." 


Figure 8.24 Possible phylogeny for retroelements based on RT-like sequences, 
with structural features of the elements superimposed. Boxes represent stretches of 
coding sequence: black areas, RT-like region; leftmost dark gray areas, gag gene 
region; lighter gray areas, integrase region; rightmost dark gray area in
retroviruses, envelope gene; white terminal areas, LTRs. A hypothesized ancestral
structure is shown in parentheses. (After Xiong and Eickbush 1990.) 


Further Topics in Molecular Phylogenetics 


Toward a global phylogeny and universal systematics 


Figure 8.25 illustrates how molecular phylogenies can be estimated for
particular taxa (the fungus Fusarium oxysporum, in this case) at levels spanning 
the full evolutionary spectrum: from intraspecific genealogy to distant
relationships early in the history of life. Within the next decade or two, as many 
more genes and taxa are surveyed and molecular data are assembled and 
integrated with information from traditional systematics and paleontology 
(Tudge 2000), it should become possible to reconstruct much of the 
(Super)Tree of Life (Bininda-Emonds et al. 2002; Pennisi 2001; Sugden et al. 
2003). This synthesis will stand as one of the great achievements in the his- 
tory of biology, not only for its magnitude of effort, but also for its seminal 
importance in such diverse areas as conservation biology (Vázquez and 
Gittleman 1998), the study of adaptations (Mooers and Heard 1997), and 
many other kinds of evolutionary hypothesis testing (Bininda-Emonds et al. 
1999). Indeed, a properly reconstructed Tree of Life will be indispensable as 
a historical "road map" for orienting evolutionary knowledge and guiding 
virtually all research in comparative biology (Avise 2008b). 

Several points should be made about this ongoing enterprise. First, 
although a complete Tree of Life would include micro-genealogies for each of 
the world's tens of millions of species, the hierarchically nested and pyramidal
nature of phylogeny means that fewer and fewer appraisals will be 
required at successively deeper (more inclusive) nodes. Thus, it should be 
feasible to achieve a consensus phylogeny for many, if not all, major and 
intermediate branches in the Tree of Life, plus detailed snapshots of "twig" 
relationships among select populations and species in the Tree's current 
outer canopy. 

Figure 8.25 Early empirical example of a tiered phylogenetic assessment based 
on molecular data. Shown from right to left are relationships among strains of the 
wilt fungus, Fusarium oxysporum, from mtDNA restriction analyses; and, based on 
small-subunit rRNA gene sequences, the estimated position of this species within 
the filamentous ascomycetes, and the position of these fungi within the broader 
phylogenetic hierarchy of life. For additional and updated details about deep 
molecular relationships among fungal and related lineages, see Baldauf and Palmer 
(1993), Heckman et al. (2001), and Figure 8.19. (Compiled from diagrams and
information in Bowman et al. 1992; Bruns et al. 1991; Gaudet et al. 1989; and Jacobson 
and Gordon 1990.) 

Second, interest in reconstructing the Tree of Life has spawned the 
development of various analytical phylogenetic techniques, such as "matrix 
representation with parsimony" (Baum 1992; Ragan 1992), that can knit dis- 
parate small phylogenies into “supertrees.” Such procedures (Figure 8.26) 
are necessary for two primary reasons: different types of genetic data are 
differentially informative at varying temporal depths in a tree, due largely 
to variation in molecular evolutionary rates; and the number of possible 
branching orders in multi-taxon supertrees is much larger than available 
computer algorithms can search exhaustively for optimality criteria. The 
solution is to compartmentalize the problem by estimating manageable 
phylogenies for subsets of data and taxa, then amalgamate these into
composite pictures using regions of overlap and suitable representative
information from the separate trees (Purvis 1995; Sanderson et al. 1998, 2003). 

Figure 8.26 General philosophy of supertree amalgamation from smaller,
overlapping phylogenies. 

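Both points in the preceding paragraph can be illustrated with a short Python sketch. The first two functions count the possible bifurcating trees for n labelled taxa, using the standard double-factorial formulas, to show why exhaustive searches quickly become impractical. The last function shows the usual coding step of matrix representation with parsimony: each clade in each source tree becomes one binary character, scored 1 for taxa inside the clade, 0 for other taxa in that tree, and ? for taxa missing from that tree. The nested-tuple tree format, the function names, and the five-taxon example are assumptions made for illustration; the subsequent parsimony analysis of the matrix is not shown.

```python
def num_rooted_trees(n):
    """Rooted, bifurcating, leaf-labelled trees for n taxa: (2n - 3)!!"""
    count = 1
    for k in range(3, 2 * n - 2, 2):  # 3, 5, ..., 2n - 3
        count *= k
    return count


def num_unrooted_trees(n):
    """Unrooted, bifurcating, leaf-labelled trees for n >= 3 taxa: (2n - 5)!!"""
    return num_rooted_trees(n - 1)


def leaf_set(node):
    """All leaf names at or below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def internal_clades(tree):
    """Leaf sets of internal nodes below the root, in traversal order."""
    out = []

    def walk(node, is_root):
        if isinstance(node, str):
            return
        if not is_root:
            out.append(leaf_set(node))
        for child in node:
            walk(child, False)

    walk(tree, True)
    return out


def mrp_matrix(source_trees):
    """Code the clades of several source trees as binary characters."""
    taxa = sorted(set().union(*(leaf_set(t) for t in source_trees)))
    matrix = {taxon: "" for taxon in taxa}
    for tree in source_trees:
        present = leaf_set(tree)
        for clade in internal_clades(tree):
            for taxon in taxa:
                if taxon not in present:
                    matrix[taxon] += "?"
                else:
                    matrix[taxon] += "1" if taxon in clade else "0"
    return matrix


print(num_unrooted_trees(10), num_rooted_trees(10))  # 2027025 34459425

# Two hypothetical, partly overlapping source trees.
tree1 = (("A", "B"), ("C", "D"))
tree2 = ((("C", "D"), "E"), "A")
for taxon, row in mrp_matrix([tree1, tree2]).items():
    print(taxon, row)
```

With only ten taxa there are already more than two million unrooted topologies, and the count grows faster than exponentially as taxa are added, which is why the problem is usually broken into smaller, overlapping pieces.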

Third, the reconstructed Tree of Life ideally will include dated nodes. 
Purely cladistic appraisals aimed at resolving branch topology are important, 
especially for applications such as phylogenetic character mapping, but they 
risk diverting attention from the equally important topics of branch lengths 
and divergence times. For example, the fungal cladograms depicted in 
Figure 8.25 (like many such tree diagrams in the current literature) were 
based on procedures designed primarily to recover branch topology, but 
knowledge of absolute divergence times would enrich the representations 
greatly. Phylograms (trees with branch lengths and, preferably, dated nodes) 
are far more difficult to estimate than mere cladograms, in part because
paleontology-based estimates usually postdate true lineage divergence times 
(due to missing fossils), whereas molecular-based estimates often predate 
true divergence times (due to statistical biases in the estimation procedures; 
Rodriguez-Trelles et al. 2002). However, a rapprochement between these two 
kinds of dating exercises is likely to be forthcoming as suitable correction fac- 
tors are implemented and as both of these sources of temporal information 
about evolution improve with more data (Benton and Ayala 2003). 

Nonetheless, even rough estimates of evolutionary time can greatly 
expand the information conveyed by a supertree. Empirical molecular stud- 
ies supporting this contention (including some described in detail in earlier 
sections of this chapter) have explicitly addressed temporal issues for some 
of life's earliest lineages (Doolittle et al. 1996; Hasegawa and Fitch 1996; 
Heckman et al. 2001; Wang et al. 1999), as well as for metazoan phyla (Ayala 
et al. 1998), major clades of animals (Knoll and Carroll 1999; Lynch 1999), 
seed plants (Wikström et al. 2001), higher vertebrate taxa (Nei et al. 2001), 
amphibians (Bossuyt and Milinkovitch 2001), carnivorous mammals 
(Bininda-Emonds and Gittleman 2000; O’Brien et al. 1999), and many smaller
taxonomic groups, such as the world’s squirrels (Mercer and Roth 2003). 
These and numerous other branches in the Tree of Life now have provision- 
ally dated nodes stemming from various combinations of molecular and 
paleontological data. 

A fourth point concerns an underlying assumption of the Tree of Life: that 
it is based primarily on histories of vertical rather than horizontal genetic 
transmission. But if lateral DNA transfer between branches has been evolu- 
tionarily common, as appears to be true particularly in the prokaryotic and 
early eukaryotic worlds, then some portions of the Tree of Life may be exten- 
sively reticulate, rather than strictly non-anastomotic. Especially in such cases, 
an organismal genome is a genuine mosaic of evolutionary pasts, and the 
challenge is to disentangle and interpret the histories of its separate parts. For 
this and other reasons relating to distinctions between phylogenies of specif- 
ic genes and a phylogeny of species (see Chapter 4), treelike depictions of 
organismal history should be conceptualized as prevailing genomic trends at 
best, with interesting and oft-specifiable exceptions. 

One final point concerns nomenclatural issues. Assuming that the Tree 
of Life, or at least major portions of it, can be reliably approximated using 
molecular and other data, excellent opportunities will be afforded to devel- 
op more informative taxonomies. Unfortunately, current biological classifi- 
cations are fundamentally flawed (Ereshefsky 2001) because they fail to 
standardize ranking criteria across different kinds of organisms. No direct 
comparability now exists between a genus or a family of mammals and their 
counterpart taxa in fishes, much less in invertebrates, plants, or microbes. 
One suggested response to this problem is to abandon the Linnaean hierar- 
chical system altogether and erect instead a rankless nomenclature (de 
Queiroz and Gauthier 1992, 1994). Another possibility is to retain a Linnaean 
system (or some other nomenclatural analogue) that makes classifications 
explicitly phylogenetic and also standardized.

Figure 8.27 Pictorial explanation of “temporal banding” as applied to a
hypothetical phylogeny for 25 extant species. Under this proposed scheme, clades are
assigned taxonomic ranks defined by specified windows of evolutionary time. 
(After Avise and Johns 1999.) 


Hennig (1966) proposed one means for creating such a phylogenetic sys- 
tem: Let the categorical rank of each taxon denote its geologic age. In practice, 
the ranks could be traditional Linnaean categories and subcategories (a full 
list is given in Mayr and Ashlock 1991), or a different recording system, such 
as alphanumeric code, could be used. The basic idea, elaborated by Avise and 
Johns (1999), is that "temporal bands" superimposed on phylograms would 
provide deciding criteria for assignment of taxonomic rank (Figure 8.27). The 
exact boundaries and widths of these bands are in principle arbitrary, but once 
agreed upon and ratified by the systematics community, would be universal- 
ly applied. All extant species that last shared a common ancestor during a 
specified window of time would be united into a taxonomic family, for exam- 
ple, and those tracing to a common ancestor within successively deeper win- 
dows of time would be placed in a superfamily, suborder, and so forth. To 
retain manageability, only the deepest clades within a window would be 
given formal taxonomic recognition (i.e., every named taxon would be a
clade, but not every clade would be a formal taxon). This also means that all 
evolutionary lineages traversing a given temporal band (without coalescing 
inside that band with other such traversing lineages) would be afforded
taxonomic distinctions at that categorical level.

A primary advantage of this hypothetical scheme, apart from its explicit
focus on phylogeny, is that it adopts absolute evolutionary time as
classification’s universal common denominator (rather than questionable second-
order surrogates, such as magnitudes of genetic divergence or oft-incompa- 
rable levels of phenotypic divergence among disparate organismal groups). 
Thus, nomenclatures per se would become far more informative and phylo- 
genetically meaningful. The actual classifications would probably differ 
considerably from those in present use. Many anciently separated metazoan
lineages might warrant elevation to higher categorical ranks, for example, 
and various rearrangements would be entailed at lower taxonomic levels as 
well (Figure 8.28). 
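
The temporal-banding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the band boundaries and rank labels are invented (the text stresses that real boundaries would be arbitrary until ratified), and `rank_for_age` and `formal_taxa` are hypothetical helper names, not part of any published implementation.

```python
from bisect import bisect_right

# Hypothetical band edges (millions of years before present) and the rank
# assigned to a clade whose crown node falls inside each window.
# These values are illustrative assumptions only.
BAND_UPPER_BOUNDS = [2.0, 10.0, 30.0, 60.0]
BAND_RANKS = ["species", "genus", "family", "superfamily", "suborder"]


def rank_for_age(node_age_my):
    """Return the rank whose temporal band contains the given node age."""
    if node_age_my < 0:
        raise ValueError("node age must be non-negative")
    return BAND_RANKS[bisect_right(BAND_UPPER_BOUNDS, node_age_my)]


def formal_taxa(parent, age):
    """Name only the deepest clade within each band along every lineage.

    `parent` maps each node to its parent (the root is absent from the map);
    `age` maps each node to its estimated age. A node receives a formal name
    when it is the root or when its parent lies in a different (deeper) band.
    """
    named = {}
    for node, node_age in age.items():
        ancestor = parent.get(node)
        rank = rank_for_age(node_age)
        if ancestor is None or rank_for_age(age[ancestor]) != rank:
            named[node] = rank
    return named


# Toy dated tree: root -> A -> B -> C (ages in millions of years).
age = {"root": 45.0, "A": 25.0, "B": 12.0, "C": 5.0}
parent = {"A": "root", "B": "A", "C": "B"}
named = formal_taxa(parent, age)
```

In this toy tree, node B falls in the same band as its parent A, so it remains an unnamed clade; every named taxon is a clade, but not every clade is named. The sketch does not handle lineages that cross a band without coalescing, which under the scheme would also receive a rank at that level.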

As this book testifies, numerous twigs, branches, and limbs in the Tree 
of Life have been analyzed to varying extents using molecular genetic mark- 
ers, and initial attempts have been made to root and assemble them into 
more comprehensive phylogenetic pictures. In some ways, molecular phy- 
logeneticists are at preliminary stages of biotic description analogous to 
those provided by European naturalists in the 1700s and 1800s, during their 
explorations of a newly discovered world. At that time too, vast quantities 
of systematically relevant biological data were being gathered rapidly, and 
an important challenge was to catalog, interpret, and synthesize the findings 
into a broad comparative framework. 


Molecular paleontology 


Molecular appraisals normally are directed toward extant organisms, with 
phylogenetic inferences representing extrapolations to mutational changes 
and cladistic events of the past. A long-standing dream of molecular evolu- 
tionists has been to assess extinct biota more directly, through recovery of 
biological macromolecules from fossil material. 

In 1980, Prager and co-workers reported a phylogenetic signal retained 
in the serum albumin proteins of a 40,000-year-old mammoth (Mammuthus 
primigenius) whose carcass had been preserved in the frozen soil of eastern 
Siberia. In immunological tests, rabbits injected with ground mammoth 
muscle produced antibodies that reacted strongly with albumins from 
extant Indian and African elephants, weakly with sea cows (in a related tax- 
onomic order, Sirenia), and still more weakly or not at all with other mam- 
malian albumins. Using similar assays, Lowenstein et al. (1981) showed that 
albumins from the extinct Tasmanian wolf (Thylacinus cynocephalus) produced
phylogenetically informative levels of immunological reaction
against albumins from other extant Australian marsupials. The preserved 
tissue in that study was dried muscle from museum specimens collected in 
the late nineteenth and early twentieth centuries. Apart from a few such 
examples involving fortuitously well-preserved or recent tissues, most other 
attempts to extract genetic information from fossil proteins met with little 
success (Hare 1980; Wyckoff 1972).

Figure 8.28 Examples of taxonomic disparity (with respect to both clade structure
and evolutionary time) in existing classifications of primates and fruit flies. Under
traditional taxonomic assignments (families in parentheses), Pongidae is
paraphyletic to Hominidae; also, several families of anthropoid primates share common
ancestors far more recently than did many fruit fly species placed within the single
genus Drosophila. Also shown is one way in which these disparities could, in
principle, be rectified using time-standardized taxonomic ranks (listed across the top)
under a temporal-banding framework. (After Avise and Johns 1999.)


Early studies of ancient DNA (Pääbo 1989) fared somewhat better. In the 
first successful retrieval of phylogenetically informative DNA sequences 
from museum material, Higuchi et al. (1984, 1987) recovered short mtDNA 
segments from a 140-year-old study skin (salt-preserved) of the extinct 
quagga (Equus quagga), an African species with an enigmatic mixture of
horselike and zebralike features. Fragments of DNA isolated from dried
muscle and connective tissue were cloned into a lambda phage vector, and
sequences totaling 229 base pairs were obtained and compared against
those of extant species in the family Equidae. Phylogenetic analyses of these
and additional molecular data (George and Ryder 1986; Pääbo and Wilson
1988) showed that the quagga had been closely related to the Burchell zebra,
Equus burchelli (Figure 8.29).

Figure 8.29 Phylogenetic tree based on mtDNA sequences from the extinct
quagga and extant members of the genus Equus. (After Pääbo et al. 1989.)

In 1985, Pääbo reported the isolation and biological cloning of nuclear 
DNA pieces from an Egyptian mummy 2,400 years old. One year later, Doran 
et al. (1986) reported the successful extraction of DNA from human brain tis- 
sue 8,000 years old, which had been buried in a swamp in central Florida. 
Notwithstanding these and a few other apparent success stories, traditional 
isolation procedures seldom yielded ancient DNA sequences in a sufficient 
state of preservation to be of practical utility for phylogenetic comparisons. 

Much excitement, therefore, attended early PCR-based attempts to 
recover ancient DNA from museum materials and fossil templates (Pääbo et 
al. 1989). Especially in the late 1980s and early 1990s, many researchers (see 
Brown 1992) claimed to have sequenced various PCR-amplified fragments 
of "fossil DNA" from remains of plants and animals that died as long as 
many tens of millions of years ago (e.g., Cano et al. 1992, 1993; DeSalle et al. 
1992; Golenberg et al. 1990; P. S. Soltis et al. 1992a; Woodward et al. 1994). 
Alas, most such reports now seem highly implausible; almost certainly, 
researchers mistakenly amplified modern DNA sequences that had contaminated
the fossil material (Austin et al. 1997; Hofreiter et al. 2001; Poinar
2002; Wayne et al. 1999). 

When an organism dies, its DNA is normally degraded quickly, initial- 
ly by nucleases in the cells themselves and later by relentless exogenous 
forces, including background radiation as well as oxidative and hydrolytic 
processes (Lindahl 1993a,b). Low temperatures, high salt concentrations, 
and desiccation can slow some of the degradation, but the physical chem- 
istry of DNA is such that even under superb preservation conditions, native 
sequences hundreds of nucleotides long seldom (if ever) survive as suitable 
templates for PCR for more than about a million years. Indeed, even under 
the best of natural circumstances, the oldest fossil material from which suit- 
able DNA fragments can be recovered with current technology is now 
believed to be about 50-100 millennia. 

Retrieval of ancient DNA from well-preserved fossils less than 100,000 
years old sometimes ís possible, but the PCR assays are fraught with the 
danger of sample contamination from pervasive modern DNA (Cooper and 
Poinar 2000). Thus, all claims of authenticity for fossil DNA must be viewed 
with skepticism unless stringent criteria have been met (Hofreiter et al. 
2001). These criteria may include the use of appropriate controls (e.g., mock 
extractions and PCR reactions without template), a demonstration that iden- 
tical PCR products emerge from multiple extracts, quantification of the 
number of template DNA molecules (PCR amplifications are problematic 
when fewer than 1,000 molecules initiate the process), observation of an
inverse relationship between DNA fragment length and amplification effi- 
ciency (short sequences amplify more readily), and reproducibility of out- 
comes by independent investigators. Furthermore, the laboratory itself must 
be routinely bleached and UV-irradiated to destroy contaminating modern 
DNA, and it must be physically separated from other work areas. At the 
very least, researchers must use protective clothing and face shields. 
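
The authenticity criteria listed above lend themselves to a simple checklist. The following is a minimal sketch, assuming each criterion has already been judged in the laboratory and recorded as a yes/no value; the class name `AncientDnaRun`, its field names, and the function `failed_criteria` are hypothetical, and only the 1,000-molecule threshold is taken from the text.

```python
from dataclasses import dataclass

# Threshold quoted in the text: amplifications are problematic when fewer
# than 1,000 template molecules initiate the PCR.
MIN_TEMPLATE_MOLECULES = 1000


@dataclass
class AncientDnaRun:
    """Hypothetical record of one ancient-DNA experiment."""
    negative_controls_clean: bool           # mock extractions, no-template PCRs
    consistent_across_extracts: bool        # identical products from multiple extracts
    template_molecules: int                 # quantified starting templates
    shorter_fragments_amplify_better: bool  # inverse length/efficiency relationship
    independently_reproduced: bool          # confirmed by other investigators


def failed_criteria(run):
    """Return the names of the criteria that this run does not satisfy."""
    checks = {
        "negative controls clean": run.negative_controls_clean,
        "consistent across extracts": run.consistent_across_extracts,
        "at least 1,000 template molecules":
            run.template_molecules >= MIN_TEMPLATE_MOLECULES,
        "shorter fragments amplify better": run.shorter_fragments_amplify_better,
        "independently reproduced": run.independently_reproduced,
    }
    return [name for name, passed in checks.items() if not passed]


# Example: a run with too few templates and no independent replication.
run = AncientDnaRun(True, True, 400, True, False)
problems = failed_criteria(run)
```

An empty list would mean only that the listed checks were passed; it would not by itself establish authenticity, which also depends on the laboratory precautions described above.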

One helpful approach to ensuring the authenticity of ancient DNA also 
serves as a preliminary molecular screen for precious fossils and museum 
specimens. Most amino acids can exist in two mirror-image isomeric forms 
(L and D) that rotate plane-polarized light in opposite directions. In living 
tissues, however, the L-forms greatly predominate due to the action of spe- 
cialized racemase enzymes. After an organism dies and racemase activity 
halts, some of the L-isomers gradually convert to D-isomers at predictable 
rates. This chemical process is the basis for conventional "racemization" 
dating of fossils, but it can also be used to address whether fossil tissue sam- 
ples might be well enough preserved to contain endogenous DNA (Poinar 
et al. 1996). Starting with tiny bits of museum tissue, the extent of racemiza- 
tion can be measured, and the samples thereby screened for potential suit- 
ability as sources of bona fide ancient DNA. 
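
The idea of reading preservation state (or age) from the D/L ratio can be illustrated with a short sketch. It assumes the simplest textbook model, reversible first-order kinetics with equal forward and reverse rate constants k and a starting ratio of zero, for which ln[(1 + D/L)/(1 − D/L)] = 2kt. The rate constant below is a made-up number (real values depend strongly on temperature and on the amino acid), and the screening cutoff is an assumed placeholder rather than a value given in the text.

```python
import math


def dl_ratio_after(years, k_per_year):
    """D/L ratio expected after `years`, starting from pure L (D/L = 0).

    Solving ln[(1 + r) / (1 - r)] = 2 k t for r gives r = tanh(k t).
    """
    return math.tanh(k_per_year * years)


def racemization_age(dl_ratio, k_per_year):
    """Invert the kinetics: years needed to reach the observed D/L ratio."""
    if not 0.0 <= dl_ratio < 1.0:
        raise ValueError("D/L ratio must lie in [0, 1)")
    return math.atanh(dl_ratio) / k_per_year


def worth_extracting(dl_ratio, cutoff=0.08):
    """Screen a sample: low racemization suggests better preservation.

    The default cutoff is an assumption chosen for illustration.
    """
    return dl_ratio <= cutoff


K_ASSUMED = 1e-5  # hypothetical rate constant per year
ratio = dl_ratio_after(50_000, K_ASSUMED)
estimated_age = racemization_age(ratio, K_ASSUMED)
```

Because the rate constant is so sensitive to burial temperature, a sketch like this is better suited to ranking specimens from a similar environment than to assigning absolute ages.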

The danger of sample contamination is especially acute when dealing 
with ancient hominid remains (Stoneking 1995). Nonetheless, a developing 
success story in molecular paleontology has involved the retrieval of DNA 
from fossil bones of several Neanderthal specimens (Krings et al. 1997, 2000;
Ovchinnikov et al. 2000). Homo sapiens neanderthalensis was a morphologically
distinct hominid that appeared in Europe and western Asia at least
200,000 years ago and persisted until about 30,000 years ago, thus overlap- 
ping temporally with modern humans (H. sapiens sapiens), including Cro- 
Magnons. Long-standing debates have centered on reproductive and phy- 
logenetic relationships between these taxonomic subspecies. Using 
Neanderthal fossils from widely separated sites in Europe, mtDNA 
sequences were recovered and found to fall essentially outside the range of 
mtDNA sequence variety observed among humans worldwide today. (The 
fact that Neanderthal sequences were distinctive and phylogenetically unified
was further testimony to their own authenticity.) The magnitude of
sequence divergence suggests that the matrilineal separation between 
Neanderthals and the ancestors of modern humans occurred more than half 
a million years ago, a date considerably older than estimates of the matri- 
lineal coalescent time for all extant humans (ca. 200,000 years bp). 
Furthermore, molecular analyses of 24,000-year-old fossils of anatomically 
modern humans of the Cro-Magnon type indicated that the mtDNA 
sequences of these individuals fell well within the range of sequence varia- 
tion in living humans (Caramelli et al. 2003), although in this case contami- 
nation with modern DNA was almost impossible to eliminate as an expla- 
nation. If these latter molecular findings involving Cro-Magnons are valid, 
the genetic distinction between H. s. sapiens and H. s. neanderthalensis 
assumes even greater potential biological significance. 

PCR-based sequence comparisons between deceased and modern 
DNAs have similarly been achieved for several other animal groups. These 
studies typically involve mtDNA because targeted fragments of these cyto- 
plasmic genomes occur in much higher abundance than single-copy 
nuclear genes, and because suitable PCR primers are often available. For 
example, Thomas et al. (1989) used mtDNA sequences from the extinct 
Tasmanian wolf to confirm that this species was related more closely to 
other Australian marsupials than to carnivorous marsupials in South 
America (see also Krajewski et al. 1997). Janczewski et al. (1992) character- 
ized mitochondrial and nuclear sequences from 14,000-year-old bones of 
the sabre-toothed cat (Smilodon fatalis) from tar pits in Los Angeles, thereby 
uncovering the phylogenetic position of this extinct species within the evo- 
lutionary radiation of Felidae. 

Other extinct animals from which phylogenetically informative DNA 
sequences have been recovered include the mastodon (Mammut americanum; 
Yang et al. 1996), woolly mammoth (Mammuthus primigenius; Noro et al. 
1998), blue antelope (Hippotragus leucophaeus; Robinson et al. 1996), Steller's 
sea cow (Hydrodamalis gigas; Ozawa et al. 1997), ground sloth (Mylodon darwinii;
Höss et al. 1996), cave bear (Ursus spelaeus; Hänni et al. 1994; Hofreiter
et al. 2002; Loreille et al. 2001; Orlando et al. 2002), pig-footed bandicoot 
(Chaeropus ecaudatus; Westerman et al. 1999), a bovid from the Balearic
Islands (Myotragus balearicus; Lalueza-Fox et al. 2000), and various flightless
(Thambetochen, Ptaiochen) and flighted (Anas) waterfowl from Hawaii 
(Cooper et al. 1996; Sorenson et al. 1999b). In 2001, Cooper and colleagues 
were the first to publish complete mtDNA genomic sequences from fossil 
remains (two species of extinct moas from New Zealand). Two years later, 
two research groups (Bunce et al. 2003; Huynen et al. 2003) were perhaps the 
first to use a sex-linked molecular marker to identify the gender of individu- 
als from fossil remnants of extinct species (also moas). 

In some instances, mtDNA sequences have been amplified from multiple 
fossil specimens or geographic sites and compared with modern samples, 
permitting direct temporal assessments of population genetic or phylogeo- 
graphic patterns. For example, in the brown bear (Ursus arctos), distinctive 
matrilines currently restricted to different North American regions apparent- 
ly co-occurred in a Beringian population 36,000 years ago, as judged by com- 
parisons of fossil and extant mtDNAs (Leonard et al. 2000; see also Barnes et 
al. 2002). In a similar study of horses (Equus caballus) based on several late 
Pleistocene fossils (12,000 to 28,000 years old), genetic diversity was found to 
be high, but the mtDNA lineages fell mostly in one distinct portion of the 
broader matrilineal tree for extant horse populations (Vila et al. 2001). 
Lambert et al. (2002) recovered mtDNA sequences from 7,000-year-old sub- 
fossil bones of Adélie penguins (Pygoscelis adeliae) to assess population genet- 
ic changes in a hypervariable region of the molecule, as did Hadly et al. (1998) 
for mtDNA cytochrome b sequences using 2,400-year-old remains of pocket 
gophers (Thomomys talpoides). Across a more recent time frame, Guinand et al. 
(2002) used archived scales from lake trout (Salvelinus namaycush) that resided 
in the Upper Great Lakes during the 1940s and 1950s as a source of microsatel- 
lite DNA to compare against allele frequencies in extant populations. Modern 
collections showed reduced genetic variation, probably due to severe popula- 
tion declines of this species during the mid-twentieth century. 

Studies of ancient DNA have even been used to examine ecological con- 
ditions of the past. For example, by scrutinizing animal mtDNAs and plant 
cpDNAs recovered from 12,000-year-old packrat middens in a Chilean 
desert and comparing these fossil sequences against those of extant species, 
Kuch et al. (2002) identified a number of animals (a vicuña, two rodents, and
a bird) plus several plant species representing taxonomic families no longer 
found at the site today. Results suggest a diverse biota and a more humid 
climate at that location when the middens were deposited. Other amazing 
studies in molecular paleontology used cpDNA sequences from coprolites 
(fossil dung) to reconstruct the diet of an extinct herbivorous ground sloth 
(Nothrotheriops shastensis) that inhabited Nevada in the late Pleistocene 
(Hofreiter et al. 2000; Poinar et al. 1998). Finally, from ancient human feces 
2,000 years old, both cpDNA and mtDNA sequences were recovered by PCR 
and used to deduce the diet of omnivorous Native Americans at a cave site 
in Texas (Poinar et al. 2001b).


SUMMARY


1. Phylogenetic hypotheses underlie virtually all conclusions in comparative 
organismal evolution. Phylogenetic character mapping (PCM) has become a 
popular means of making these hypotheses more explicit and testable. In 
PCM, particular organismal features are matched with their associated species 
on a cladogram, with the purpose of revealing the evolutionary histories of 
those traits. These independent appraisals of phylogeny are now based rou- 
tinely on molecular markers. 


2. Phylogenetic character mapping against a molecular backdrop has been
accomplished for numerous anatomical, physiological, and behavioral features 
of plants, animals, and microbes. Some organismal traits have proved to be 
monophyletic, others polyphyletic. Concepts of gradients and thresholds in 
the phylogeny of quantitative traits have also been stimulated by PCM exercis- 
es. 


3. Molecular markers are used widely in biogeographic assessment: for example,
to test dispersalist versus vicariance explanations for the appearance of related 
taxa in disjunct geographic regions, to distinguish between common ancestry 
and convergence as an explanation for organismal similarities among different 
regional biotas, and to reconstruct biogeographic histories of island inhabi- 
tants. 


4. Phylogenetic assessments above the species level have been based on many 
molecular approaches, including DNA hybridization, immunological methods, 
and restriction site analyses, but nucleotide sequencing has become the 
method of choice in recent years for examining slowly evolving nuclear and 
cytoplasmic genes in a macroevolutionary context. The phylogenetic content 
of the sequences themselves, plus that of eccentric molecular features such as 
alternative gene orders, presence versus absence of particular introns, or pat- 
terns of codon assignment, offer many special opportunities for clade delin- 
eation. 


5. Lateral genetic transfers were probably common early in the history of life, the 
most notable examples being the endosymbioses that eventuated in the dis- 
tinctive nuclear and cytoplasmic genomes of eukaryotic cells. Such reticulation 
events raise important questions about the frequency of DNA exchanges 
across lineages near the base of the Tree of Life, and even about the fundamen- 
tal meaning of organismal “phylogeny.” Nonetheless, molecular phylogenetic 
studies are revealing major features in the evolutionary histories of even the 
simplest and most ancient forms of life. 


6. In addition to ancient inter-genomic exchanges of nucleic acids, horizontal 
gene transfers (HGTs) have proved to be far more common in subsequent evo- 
lution than formerly supposed. Through the various types of phylogenetic 
"signatures" or "footprints" that HGTs produce, many contemporary as well 
as historical reticulation events (often involving transposable elements) have 
been provisionally documented within and between prokaryotes and eukary- 
otes. Caution is called for in reading such historical signatures, however, 
because several non-HGT evolutionary processes can mimic their effects. 


7. In the near future, prospects are great for developing a global or universal 
phylogeny for all of life. Molecular methods will play a huge role in that 
endeavor. Important questions have arisen about how to analyze, interpret, 
standardize, and taxonomically summarize the wealth of molecular genetic
information that is now becoming available.


8. With the advent of PCR, molecular phylogenetic appraisals have been extended
to ancient DNA sequences recovered from creatures no longer alive. In
exceptional circumstances, "fossil DNAs" have even been extracted and
phylogenetically analyzed at community-level scales and from well-preserved
materials up to tens of thousands of years old.


Molecular Markers in
Conservation Genetics


Modern biology has produced a genuinely new way of looking at the world 
... to the degree that we come to understand other organisms, we will place
a greater value on them, and on ourselves.


E. O. Wilson (1984) 


Conservation genetics is a subdiscipline within the broader field of conservation 
biology (Meffe and Carroll 1997). It has sometimes been characterized primarily 
as the study of inbreeding effects and losses of adaptive genetic variation in small 
populations (Frankham 1995), but this is an unduly narrow characterization, as 
this chapter will attest. A brief history of major developments in the young but 
expanding field of conservation genetics is presented in Box 9.1. 

In the final analysis, biodiversity is genetic diversity. As we have seen, this 
genetic diversity is genealogically arranged across diverse temporal scales: from 
family units, extended kin groups, and phylogeographic population structures 
within species to graded magnitudes of genetic divergence among species that 
became phylogenetically separated at various times in the evolutionary past. As 
we have also learned, visible phenotypes of organisms are not infallible guides to 
the way in which this genealogical diversity is arranged. Sadly, even as powerful 
molecular tools have become available to assess this genetic variety in exciting 
new ways, the marvelous biodiversity that has carpeted our planet is being lost 
at a pace that is nearly unprecedented in the history of life (Ehrlich and Ehrlich 
1991). Biodiversity is in serious decline, with, for example, approximately 50% of 
vertebrate animal species and 12% of all plants now considered vulnerable to 
near-term extinction, mostly as a result of effects of habitat alteration associated 
with human population growth (Frankham et al. 2002). Earth recently entered the

BOX 9.1 Brief Chronology of Some Key Developments in the History of Conservation Genetics

Many other important contributions could also be listed, but the following
provide representative examples and a general time frame of events.


1966  Lewontin and Hubby introduce protein electrophoretic techniques to
population biology.


1973  The Endangered Species Act sets a legal precedent in the United States
for identifying and conserving rare taxa.


1975  Frankel and Hawkes edit a volume focusing on management of crop
genetic resources; Martin edits a volume on captive breeding of
endangered species.


1979  Ralls and colleagues draw attention to the wide occurrence of
inbreeding depression in captive populations (for later updates, see Ralls et al.
1988; Frankham 1995). Avise and colleagues, and independently Brown
and colleagues, introduce mtDNA approaches to population biology.


1980  Soulé and Wilcox publish the first of several conservation books with
an evolutionary genetic as well as an ecological orientation (see also
Frankel and Soulé 1981; Soulé 1986, 1987; Soulé and Kohm 1989).


1982  Laerm and colleagues publish the first multifaceted genetic appraisal of
the taxonomic status of a wild endangered species.


1983  Schonewald-Cox and colleagues edit the first major volume devoted
explicitly to genetic perspectives in conservation. O'Brien and
colleagues initiate a series of studies on inbreeding, heterozygosity, and
population bottlenecks in wild felids. Mullis invents the polymerase
chain reaction technique for in vitro amplification of DNA (see Mullis
1990).


1984  Templeton and Read initiate an influential series of studies on
eliminating inbreeding depression in captive gazelles.


1985  The Society for Conservation Biology is formed. Jeffreys and colleagues
introduce DNA fingerprinting methods.


1986  Ryder brings the phrase "evolutionarily significant unit" to wide
attention in conservation biology.


1987  Ryman and Utter edit a volume on population genetics in fisheries
management. The first issue of Conservation Biology (Blackwell) is
published, complementing earlier journals such as Biological Conservation
and Journal of Wildlife Management. Avise and colleagues introduce the
term "phylogeography" and outline the field's major principles.


1988  Lande draws focus to genetic versus demographic concerns for small
populations.


1989  The Captive Breeding Specialist Group begins a series of population
viability analyses (PVAs) for endangered taxa (see review in Ellis and
Seal 1995). The U.S. Fish and Wildlife Service opens a laboratory facility
(in Ashland, Oregon) devoted explicitly to wildlife forensics.
Microsatellites are introduced (by Tautz as well as Weber and May, and
others) as a source of highly polymorphic nuclear markers. Avise
promotes novel roles for molecular genetics in the recognition and
conservation of endangered species.


1990  Hillis and Moritz edit a volume summarizing the many laboratory 
genetic approaches to molecular systematics. 


1991  Vane-Wright and colleagues raise novel issues about phylogenetic
diversity and conservation value (see also Forey et al. 1994; Humphries 
et al. 1995). Falk and Holsinger edit a volume on conservation genetics 
in rare plants. 


1992 Avise empirically introduces a comparative perspective in conservation 
genetics for a regional biota. Hedrick and Miller address conservation 
biology notably from the vantage of genetic diversity and disease sus- 
ceptibility. Groombridge edits a taxonomic and genetic inventory of 
global biodiversity as a backdrop for conservation efforts.

1993 Thornhill edits an important volume on the natural history and conse- 
quences of inbreeding and outbreeding. 


1994 Avise publishes the first edition of this current textbook. Loeschcke and 
colleagues produce an edited volume on conservation genetics. Burke
edits a special issue of Molecular Ecology devoted to conservation genet- 
ics. Baker and Palumbi provide a powerful application of molecular 
forensics in monitoring endangered species products. 


1995 Ballou and colleagues edit a volume on genetic and demographic man- 
agement issues for small populations. 


1996 Avise and Hamrick, and, independently, Smith and Wayne, edit com- 
pendia of molecular studies in conservation genetics. Rhymer and 
Simberloff review the topic of genetic extinction via introgressive 
hybridization. 

1997 Hanski and Gilpin edit a volume on the metapopulation concept, 
including issues of genetics, evolution, and extinction (see also Rhodes
et al. 1996). 


1998 Allendorf edits a special issue of the Journal of Heredity devoted to
conservation genetics of marine organisms.


1999  Landweber and Dobson edit a volume on genetics and species
extinctions. Wildt and Wemmer review and preview reproductive
technologies (cloning, embryo transfer, etc.) in conservation biology.

2000 The journal Conservation Genetics (Kluwer) is launched. Avise publishes
the first textbook on phylogeography, a field with a genealogical slant 
on genetic variation and conservation. 


2002  Frankham and colleagues publish the first introductory “teaching
textbook” on conservation genetics.


Source: After Avise 2004b.

sixth mass extinction episode in its history (Leakey and Lewin 1995; Wilson
1992), the only one caused by a living creature. Not since 65 million years 
ago, when a large asteroid slammed into the planet, has there been such a 
sudden negative impact on global biodiversity (Wilson 2002). 

One goal of conservation biology is to preserve genetic diversity at any 
and all possible levels in the phylogenetic hierarchy—that is, to save as 
much as possible of the Tree of Life (Mace et al. 2003). Another goal is to pro- 
mote the continuance of ecological and evolutionary processes that foster 
and sustain biodiversity (Bowen 1999; Crandall et al. 2000; Moritz 2002). 
Genetic drift, gene flow, natural selection, sexual selection, speciation, and 
hybridization are examples of natural and dynamic evolutionary processes 
that orchestrate how genetic diversity is arranged. 

This concluding chapter addresses the following question: How can 
molecular markers contribute to assessments of genetic diversity and natu- 
ral processes in ways that are serviceable to the field of conservation biolo- 
gy? The most general answer is simple: Molecular genetic tools help us to 
understand the nature of life. More specifically, molecular markers offer 
conservation applications in all of the topical areas that formed the organi- 
zational framework for this book, including assessments of genetic variation 
within populations, biological parentage, kinship, gender identification, 
population structure, phylogeography, wildlife forensics at various levels, 
speciation, hybridization, introgression, and phylogenetics. In this chapter 
we will revisit these topics in order, but now using illustrations that are 
especially germane to conservation efforts. 


Within-Population Heterozygosity Issues 


About 30% of all publications in the field of "conservation genetics" have
focused on how best to preserve variability within rare or threatened popu- 
lations (Avise 2004b). A common assumption underlying these studies is 
that higher mean heterozygosity (H, a measure of within-population genet- 
ic variation; see Chapter 2) enhances a population's survival probability 
over ecological or evolutionary time. Traditional approaches to heterozy- 
gosity assessment and management have been indirect. Management of H 
in captive populations (e.g., in zoos) often occurs de facto through breeding
programs designed to avoid intense inbreeding, either by maintaining pop- 
ulations above some "minimum viable population size" or by exchanging 
breeding individuals among sites. For natural populations, some analogous 
management approaches have been to ensure adequate habitat such that 
local effective population sizes remain above levels at which inbreeding 
(and its associated fitness depression) becomes pronounced and to maintain 
habitat corridors that facilitate natural dispersal and gene flow among pop- 
ulations (Hobbs 1992; Simberloff and Cox 1987; but see Simberloff et al. 1992 
for a critical appraisal of corridor programs). Thus, concerns about inbreed- 
ing depression in small populations, both captive and wild, have motivated 
much of the work in conservation genetics (Hedrick and Kalinowski 2000).

With the advent of molecular techniques, more direct estimates of heterozygosity
were made possible. These estimates typically involve assays of
multiple marker loci (such as allozymes or microsatellites). Such molecular 
heterozygosity estimates raise two major conservation-related issues: Is 
molecular variability reduced significantly in rare or threatened popula- 
tions? If so, is this reduction a cause for serious concern about the future of 
those populations? 
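
As a rough illustration of how a multi-locus heterozygosity estimate can be assembled, the sketch below averages expected heterozygosity (one minus the sum of squared allele frequencies) across loci. The estimator choice, the function names, and the ten-locus survey are assumptions made for this example only; published studies may instead report direct-count heterozygosity or apply sample-size corrections.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared allele frequencies."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies at a locus must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def mean_heterozygosity(loci):
    """Average the per-locus values over all assayed loci, monomorphic ones included."""
    return sum(expected_heterozygosity(freqs) for freqs in loci) / len(loci)


# Hypothetical survey: 9 monomorphic loci and 1 locus with two alleles at 0.9 / 0.1.
survey = [[1.0]] * 9 + [[0.9, 0.1]]
H = mean_heterozygosity(survey)
print(round(H, 3))
```

Because monomorphic loci contribute zero, a survey dominated by invariant loci yields a low mean H even when one locus is appreciably polymorphic.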


Molecular variability in rare and threatened species 


In the mid-1800s, indiscriminate commercial harvests of northern elephant 
seals (Mirounga angustirostris) reduced this formerly abundant species to dan- 
gerously low levels. Fewer than 30 individuals survived through the 1890s (all 
on a single remote island west of Baja California), but following legislative 
protection by Mexico and the United States, the species rebounded and now 
numbers tens of thousands of individuals, distributed among several rook- 
eries. Bonnell and Selander (1974) first surveyed 24 allozyme loci in 159 of 
these seals from five rookeries and observed absolutely no genetic variability, 
a striking finding given the high heterozygosities reported for most other 
species similarly assayed by protein electrophoretic methods. Several addi- 
tional molecular analyses have since confirmed and extended these findings 
of an exceptional paucity of genetic variation in northern elephant seals 
(Hoelzel 1999; Hoelzel et al. 1993, 2002b). Results cannot be attributed to 
recent phylogenetic legacy or to some other peculiarity of marine Pinnipedia 
because, in identical molecular assays, the closely related southern elephant 
seal (M. leonina) displayed normal levels of genetic variation (Slade et al. 1998). 

In recent decades, an isolated and endangered population of gray 
wolves (Canis lupus) on Isle Royale in Lake Superior declined from about 50 
individuals to as few as a dozen. Molecular studies then revealed that 
approximately 50% of allozyme heterozygosity had been lost relative to 
mainland samples (Wayne et al. 1991b). Furthermore, only a single mtDNA 
genotype remained on the island. In terms of multi-locus nuclear DNA fin- 
gerprints, these Isle Royale wolves were about as similar genetically as were 
full-sibling wolves in a captive colony, suggesting that the island population 
had become severely inbred (Wayne et al. 1991b). 

Hillis et al. (1991) employed allozyme markers to estimate genetic vari- 
ability in the Florida tree snail (Liguus fasciatus), many of whose populations 
are threatened or already extinct. Among 34 genes monitored in 60 individ- 
uals, only one locus was polymorphic, and mean heterozygosity was only 
0.002. Perhaps a population bottleneck accompanied or followed this 
species’ colonization of Florida from Cuba, or perhaps the snail’s habit of 
partial self-fertilization played a role in its loss of heterozygosity (at least 
within local populations). Surprisingly, the lack of appreciable variation at 
the allozyme level contrasts diametrically with this species’ exuberant
morphological variability, especially with regard to genetically based shell
patterns. Results highlight the fact that mean heterozygosity (as registered by
molecular markers) and magnitude of phenotypic variation (even that
which is genetically encoded) are not necessarily similar. 

Genetic variation in remnant populations of the Sonoran topminnow 
(Poeciliopsis occidentalis) in Arizona, where the species is endangered, was 
compared at 25 allozyme loci with genetic variation in populations from 
Sonora, Mexico, where the fish is widespread and abundant. The peripheral 
populations in Arizona exhibited significantly less variation than did the 
Mexican populations near the center of the species' distribution (Vrijenhoek et 
al. 1985). The molecular analysis also revealed three major genetic groups 
within this species' range. The authors concluded that these three units 
should be maintained as discrete entities in nature because most of the over- 
all genetic diversity in P. occidentalis is attributable to these inter-group differ- 
ences. They also recommended that any restocking efforts in Arizona employ 
local populations whose mixing would increase within-population heterozy- 
gosity without eroding the genetic differentiation that characterizes the 
broader geographic assemblages. 

Another endangered species assayed extensively for molecular genetic 
variability is the cheetah (Acinonyx jubatus). The South African subspecies of 
this large cat was first surveyed at 47 allozyme loci, all of which proved to be 
monomorphic, and at 155 abundant soluble proteins revealed by two-dimen- 
sional gel electrophoresis, at which heterozygosity also proved to be low (H = 
0.013; O'Brien et al. 1983). Subsequent assays of more allozyme markers and 
of RFLPs at the major histocompatibility complex (MHC) gave further sup- 
port to the notion that this population is extremely genetically depauperate 
(O'Brien et al. 1985b; Yuhki and O'Brien 1990). Additional confirmation came 
from the fact that these cats fail to acutely reject skin grafts from "unrelated" 
conspecifics. The low molecular genetic variation documented in cheetahs 
cannot be attributed to some inherent property characteristic of all cats, 
because other species of Felidae often exhibit normal to high levels of genic 
heterozygosity in these same kinds of assays (O'Brien et al. 1996). Later sur- 
veys of rapidly evolving molecular systems (mtDNA and VNTR nuclear loci) 
did uncover modest genetic variation in cheetahs across their broader range, 
but the overall magnitude remained low, leading Menotti-Raymond and 
O'Brien (1993) to conclude that the heterozygosity present today could be due 
to post-bottleneck mutational recovery over a time frame of roughly 6,000 to 
20,000 years. O'Brien et al. (1987) proposed that the cheetah experienced at 
least two population bottlenecks: one approximately 10,000 years ago, prior to 
geographic isolation of the two recognized subspecies (which are highly sim- 
ilar genetically), and a second within the last century, which may have pro- 
duced the exceptional genetic impoverishment of the South African form. 

A similar scenario of bottleneck effects emerged regarding the Asiatic 
lion (Panthera leo persica), which now occurs as a remnant population in the 
Gir Forest Sanctuary in western India. Allozyme surveys (ca. 50 loci) detect- 
ed absolutely no variation in a sample of 28 individuals from this sub- 
species, whereas the Serengeti population of the African subspecies had 
much higher genetic variation (Wildt et al. 1987). Similar results emerged
from DNA fingerprinting methods and analyses of MHC loci (O’Brien et al.
1996). The relict group of lions in the Gir Forest is descended from a popu- 
lation that contracted to fewer than 20 animals in the first quarter of the 
twentieth century. The obvious interpretation is that this population reduc- 
tion profoundly affected genomic variation. Analogous scenarios of popula- 
tion bottlenecks and inbreeding in felids were documented for an isolated 
population of lions in Africa's Ngorongoro Crater and an isolated popula- 
tion of cougars (Puma concolor coryi) in North America's Florida Everglades 
(O'Brien et al. 1996) (see below). 

In a review of 38 endangered mammals, birds, fishes, insects, and 
plants, Frankham (1995) reported that 32 species (84%) displayed lower 
genetic diversity, as estimated by molecular markers (usually allozymes), 
than did closely related non-endangered species. A similar trend subse- 
quently was reported in DNA-level appraisals (VNTRs, RAPDs, AFLPs, or 
STRs) of threatened species versus their more common relatives (Frankham 
et al. 2002). Examples include endangered populations of such diverse crea- 
tures as the beluga whale (Delphinapterus leucas; Patenaude et al. 1994), black 
robin (Petroica traversi; Ardern and Lambert 1997), and burying beetle 
(Nicrophorus americanus; Kozol et al. 1994), and the plants Lysimachia minori- 
censis (Calero et al. 1999) and Cerastium fischerianum (Maki and Horie 1999). 
The magnitudes of these reductions in H were not invariably great, but there 
are many examples of rare or threatened populations that reportedly show 
extremely low molecular variation (some of these are listed in Table 9.1). In 
most of these instances, results were provisionally attributed to effects of 
genetic drift attending historical bottlenecks in population size.

On the other hand, many rare or endangered species have proved not to 
be unusually constrained in genetic variation. Examples of threatened 
species that have displayed more or less normal levels of molecular vari- 
ability include a federally protected spring-dwelling fish (Gambusia nobilis) 
endemic to the Chihuahuan desert (A. F. Echelle et al. 1989); Przewalski’s 
horse (Equus przewalskii), which is extinct in the wild but survived by several
hundred animals in zoos (Bowling and Ryder 1987); the endangered manatee
(Trichechus manatus) in Florida (Garcia-Rodriguez et al. 1998;
McClenaghan and O’Shea 1988); and Stephens’ kangaroo rat (Dipodomys 
stephensi), whose historical range is restricted to interior coastal valleys of 
southern California (Metcalf et al. 2001). Additional examples include sev- 
eral endangered avian species (see review in Haig and Avise 1996): a flight- 
less parrot (Strigops habroptilus) native to New Zealand (Triggs et al. 1989), 
the American wood stork (Mycteria americana; Stangel et al. 1990), and most 
populations of the red-cockaded woodpecker (Picoides borealis) in the south- 
ern United States (Stangel et al. 1992). 

TABLE 9.1 Examples of rare or endangered species with exceptionally low genetic
variability, as documented by multi-locus molecular methods

Plants

Species: Bensoniella oregona
Observation: Complete absence of allozyme variation (24 loci) within or among
populations of this endemic herbaceous perennial in southwest Oregon and
northwest California.
Reference: P. S. Soltis et al. 1992b

Species: Eucalyptus phylacis
Observation: The last remnant stand of this Australian tree is so depauperate in
genetic variation as to consist in effect of a single clone. Another closely
related local endemic was moderately variable, however.
Reference: Rossetto et al. 1999

Species: Harperocallis flava
Observation: This endemic to the Apalachicola lowlands of the Florida Panhandle
was monomorphic at all 22 allozyme loci scored, in sharp contrast to high
genetic variation in a related lily species widespread in that region.
Reference: Godt et al. 1997

Species: Howellia aquaticus
Observation: Complete absence of allozyme variation (18 loci) within or among
populations of this rare and endangered aquatic plant in the Pacific Northwest.
Reference: Lesica et al. 1988

Species: Pedicularis furbishiae
Observation: Complete absence of allozyme variation (22 loci) within or among
populations of this endangered hemiparasitic lousewort in northern Maine.
Reference: Waller et al. 1987

Species: Posidonia oceanica
Observation: A population of this seagrass in the northern Adriatic Sea appears
to be in effect a single clone, as gauged by microsatellite assays.
Reference: Ruggiero et al. 2002

Species: Saxifraga cernua
Observation: Complete absence of variation in RAPD or allozyme markers within
each of several glacial relict populations in the European Alps.
Reference: Bauert et al. 1998

Species: Trifolium reflexum
Observation: Complete absence of allozyme variation (14 loci) in the only known
population of this rare native clover in Ohio; however, allozyme assays
(20 loci) of an endangered congener, T. stoloniferum, did reveal moderate
levels of genetic variation.
Reference: Hickey et al. 1991

Animals

Species: Bison bison
Observation: Only one allozyme locus (among 24 tested) was polymorphic in a
bison herd in South Dakota known to be descended from a small founder group;
other bison herds show microsatellite heterozygosities that are correlated with
numbers of founding animals.
Reference: McClenaghan et al. 1990; Wilson and Strobeck 1999

Species: Castor fiber
Observation: Scandinavian beavers, which went through a severe bottleneck in the
1800s due to overhunting, now show extremely low variation at DNA
fingerprinting and MHC loci; nonetheless, the population recovered and expanded
tremendously during the twentieth century.
Reference: Ellegren et al. 1993

Species: Monachus schauinslandi
Observation: Populations of this critically endangered Hawaiian monk seal
exhibit extremely low genetic variation in nuclear DNA fingerprints and mtDNA.
Reference: Kretzmann et al. 1997

Species: Mustela nigripes
Observation: Only one allozyme locus (among 46 tested) was polymorphic in the
only known remaining population of the highly endangered black-footed ferret;
microsatellite variation was also considerably reduced.
Reference: O'Brien et al. 1989; Wisely et al. 2002

Species: Perameles gunnii
Observation: Complete absence of allozyme variation (27 loci) within an
endangered, isolated population of the eastern barred bandicoot in Australia
(however, a widespread and dense population of the same species in Tasmania
also lacked genetic variation at these same loci).
Reference: Sherwin et al. 1991

Species: Strix occidentalis
Observation: Complete absence of allozyme variation (23 loci) in six populations
of the endangered spotted owl from Oregon and California.
Reference: Barrowclough and Gutiérrez 1990

In theory, the demographic details of population bottlenecks (such as
their size, duration, and periodicity) should exert important influences on the
severity of expected reductions in neutral genetic variability (Luikart et al.
1998). For example, the loss in mean heterozygosity can be minimal if
population size increases rapidly following a single bottleneck of short duration
(Nei et al. 1975). An empirical example of a severe population reduction that,
for suspected demographic reasons, did not result in low heterozygosity
involves the endangered one-horned rhinoceros (Rhinoceros unicornis). Prior
to the fifteenth century, about half a million of these animals ranged across a
broad area from northwestern Burma to northern Pakistan. Land clearing and
human settlement then began to fragment and destroy rhino habitat, and by
1962 fewer than 80 animals remained, all in what is now Nepal's Royal
Chitwan Park. Surprisingly, this herd proved to exhibit one of the highest
allozyme heterozygosity values reported for any vertebrate: H = 0.10
(Dinerstein and McCracken 1990). One possibility is that loss of rhino habitat
across the Indian subcontinent compressed surviving populations into the
Chitwan area, thereby concentrating into a single locale considerable genetic
variation that formerly had been distributed among regions.
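
The dependence of heterozygosity loss on bottleneck duration can be sketched with the standard neutral expectation that, ignoring mutation, each generation of drift multiplies expected heterozygosity by 1 - 1/(2N). The population sizes, the starting value of H, and the function name below are hypothetical choices for illustration, not figures taken from any of the studies cited.

```python
def heterozygosity_after(h0, sizes):
    """Expected heterozygosity after drift through generations with the given effective sizes."""
    h = h0
    for n in sizes:
        h *= 1.0 - 1.0 / (2.0 * n)
    return h


H0 = 0.10  # hypothetical starting heterozygosity

# Scenario 1: one generation at N = 10, then the population doubles each generation.
brief = [10 * 2 ** g for g in range(10)]
# Scenario 2: the population stays at N = 10 for all ten generations.
prolonged = [10] * 10

retained_brief = heterozygosity_after(H0, brief) / H0
retained_prolonged = heterozygosity_after(H0, prolonged) / H0
print(f"brief bottleneck:     {retained_brief:.1%} of H retained")
print(f"prolonged bottleneck: {retained_prolonged:.1%} of H retained")
```

Under these assumptions the rapidly recovering population keeps most of its heterozygosity, whereas ten generations at the same minimum size erode a far larger share.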

The particular molecular markers employed can also dramatically influ- 
ence estimates of genetic heterozygosity. Most of the early studies used 
allozymes (see Table 9.1), but multi-locus microsatellite appraisals have been 
employed increasingly in recent years to reassess within-population variation 
and relate it to population demography and fitness components potentially 
associated with inbreeding depression (e.g., Coltman et al. 1998a,b; Coulson 
et al. 1998; Hedrick et al. 2001; Pemberton et al. 1999; Rossiter et al. 2001; Slate 
et al. 2000). However, STR loci have high mutation rates and tend to recover 
genetic variation quickly, so the molecular footprints of population bottle- 
necks on these loci should be less long-lasting than on allozyme loci. 
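
Why fast-mutating loci should recover variation sooner can be sketched with the classical mutation-drift recursion for expected homozygosity F under the infinite-alleles model, F' = (1 - u)^2 [1/(2N) + (1 - 1/(2N)) F], with H = 1 - F. This is a simplification (microsatellites are usually modeled with stepwise mutation), and the effective size, elapsed generations, and both mutation rates are assumed values chosen only to contrast a fast marker with a slow one.

```python
def recover(h_start, n_e, mu, generations):
    """Iterate the infinite-alleles recursion for expected heterozygosity H = 1 - F."""
    f = 1.0 - h_start
    for _ in range(generations):
        f = (1.0 - mu) ** 2 * (1.0 / (2 * n_e) + (1.0 - 1.0 / (2 * n_e)) * f)
    return 1.0 - f


N_E = 5000          # assumed post-bottleneck effective size
GENERATIONS = 500   # assumed time since the bottleneck

# Assumed per-generation mutation rates; real values vary by locus and taxon.
h_str = recover(0.0, N_E, 1e-3, GENERATIONS)
h_allozyme = recover(0.0, N_E, 1e-6, GENERATIONS)
print(f"STR-like locus:      H = {h_str:.3f}")
print(f"allozyme-like locus: H = {h_allozyme:.4f}")
```

Starting from complete monomorphism, the high-rate locus regains substantial heterozygosity over the assumed interval while the low-rate locus remains nearly invariant, which is the sense in which a bottleneck's footprint fades faster at STR loci.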


Does reduced molecular variability matter? 


The examples cited above indicate that genic heterozygosity is indeed 
reduced in many (though certainly not all) rare or threatened populations 
and species. Do these findings carry any special significance for conserva- 
tion efforts? Although it is tempting to assume that a paucity of genetic vari- 
ation jeopardizes a species’ future, the goal of firmly documenting a causal 
link between molecular heterozygosity and population viability remains 
elusive (see Chapter 2). In general, there are several reasons for exercising 
caution in interpreting low molecular heterozygosities reported for rare 
species: most of the reductions in genetic variation presumably have been 
outcomes, rather than causes, of population bottlenecks; at least a few wide- 
spread and successful species also appear to have low H values as estimat- 
ed by molecular methods; and in some endangered species (such as the 
northern elephant seal), low genetic variation has not seriously inhibited 
population recovery from dangerously low levels (at least to the present).

Another point is that the fitness costs of inbreeding (Box 9.2) are known
to vary widely among species (and often even among conspecific populations;
e.g., Kärkkäinen et al. 1996; Montgomery et al. 1997). Thus, some taxa are
highly susceptible, but others relatively immune, to fitness depression effects 
from consanguineous matings (Frankham et al. 2002; Laikre and Ryman 1991; 
Price and Wasser 1979; Ralls et al. 1988). Furthermore, inbreeding depression

BOX 9.2 Inbreeding Depression





Inbreeding depression is the decrease in growth, survival, or fertility often 
observed following matings among relatives. The phenomenon is of special con- 
cern in conservation biology because inbreeding is likely to be severe in small 
populations. Genetically inbred populations have reduced heterozygosity
(increased homozygosity) due to increased probabilities that individuals carry
alleles that are identical by descent (stem from the same ancestral copy in earlier 
generations of a pedigree). This probability for an individual I is the inbreeding 
coefficient, which for known pedigrees can be calculated as 


$$F_I = \sum \left(\tfrac{1}{2}\right)^{i} (1 + F_A)$$


where the summation is over all possible paths through all common ancestors, i
is the number of individuals in each path, and A is the common ancestor in each
path (for computational details, see Ballou 1983 and Boyce 1983).
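
A minimal sketch of the path-counting calculation follows: each path contributes one-half raised to the number of individuals in the path, multiplied by one plus the inbreeding coefficient of that path's common ancestor. The function name and the representation of a path as a `(i, F_A)` pair are assumptions of this example, and the paths are enumerated by hand rather than traced from a pedigree file.

```python
def inbreeding_coefficient(paths):
    """Sum (1/2)**i * (1 + F_A) over paths given as (i, F_A) pairs."""
    return sum(0.5 ** i * (1.0 + f_a) for i, f_a in paths)


# Offspring of a full-sib mating: two common ancestors (the shared parents of
# the sibs), each reached by a path of 3 individuals (sire, ancestor, dam).
full_sib = inbreeding_coefficient([(3, 0.0), (3, 0.0)])

# Offspring of first cousins: two common ancestors, each path has 5 individuals.
first_cousin = inbreeding_coefficient([(5, 0.0), (5, 0.0)])

print(full_sib, first_cousin)  # 0.25 0.0625
```

Both examples assume non-inbred common ancestors (F_A = 0); an inbred ancestor raises its path's contribution proportionally.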

Two competing hypotheses for the genetic basis of inbreeding depression 
have been debated for decades (see Charlesworth and Charlesworth 1987). 
Under the “dominance” scenario, lowered fitness under inbreeding results 
from particular loci being homozygous for otherwise rare deleterious recessive 
alleles, which in outbred populations are usually masked in expression (in
heterozygotes) by their dominant counterparts. Under the competing
“overdominance” or “heterozygous advantage” scenario, genome-wide heterozygosity
per se is the critical influence on fitness. Recent literature seems to indicate
considerable support for the dominance model, with overdominance (including
epistasis) also contributing to inbreeding depression as a secondary factor
(e.g., Carr and Dudash 2003). 

The dominance and overdominance hypotheses make different predictions 
about the relative tolerance of populations to inbreeding (Lacy 1992). If
deleterious recessive alleles cause inbreeding depression, then selection will be more
likely to have removed most such alleles from populations that have long his- 
tories of inbreeding. All else being equal, such populations should rebound 
and for a while be resistant to further inbreeding effects. In other words, under 
the dominance hypothesis, populations that survive severe inbreeding may be
temporarily “purged” of deleterious recessive alleles, and mean heterozygosity
(as estimated, for example, by molecular markers) should therefore have little
general predictive value of a population's genetic health. (However, the
frequencies of genes of large and small effect and other population genetic factors
can also have important effects on the extent to which inbreeding depression is
purged; see Byers and Waller 1999; Crnokrak and Barrett 2002.) On the other 
hand, if inbreeding depression occurs because of a selective advantage to 
genome-wide heterozygosity (the overdominance hypothesis), then inbred 
(and homozygous) populations should show reduced fitness, and under future 
inbreeding might be expected to fare no better than would highly heterozy- 
gous populations. 

Whatever the mechanistic explanation, different populations empirically
exhibit widely varying fitness costs associated with inbreeding. For example, in
a survey of captive populations of a variety of mammalian species, relative
reduction in survival in crosses between first-degree relatives (such as full sibs)
varied across more than two orders of magnitude (Ralls et al. 1988, as summa- 
rized by Hedrick and Miller 1992): 





Species                                               Cost of inbreeding
Sumatran tiger (Panthera tigris sumatrae)             <0.01
Bush dog (Speothos venaticus)                         0.06
Short bare-tailed opossum (Monodelphis domestica)     0.10
Gaur (Bos gaurus)                                     0.12
Pygmy hippopotamus (Choeropsis liberiensis)           0.33
Greater galago (Galago c. crassicaudatus)             0.34
Dorcas gazelle (Gazella dorcas)                       0.37
Elephant shrew (Elephantulus rufescens)               0.41
Golden lion tamarin (Leontopithecus r. rosalia)       0.42
Brown lemur (Lemur fulvus)                            0.90


Because inbreeding depression typically is weaker in captive than in natural 
populations, these values should perhaps be considered minimal estimates of 
what may occur in the wild. Many additional examples are discussed by 
Frankham et al. (2002), who also provide extended discussions of the causes 
and consequences of inbreeding depression in a conservation context. 


seems to have a “stochastic nature” (Frankham et al. 2002), which is reflected 
in varying outcomes depending on the particular fitness components exam- 
ined in a given species (Lacy et al. 1996), as well as in hit-or-miss expressions 
of the phenomenon in diverse organismal groups (ranging from birds and 
mammals to other vertebrates, invertebrates, and plants; Crnokrak and Roff 
1999). For all of these reasons, caution is indicated in drawing firm universal 
conclusions about levels of molecular variation as they might relate to a pop- 
ulation’s susceptibility to extinction. 

A further concern about interpreting the evolutionary significance of 
molecular variation is that published estimates based on any single class of 
markers (such as allozymes or microsatellites) may inadequately character- 
ize genome-wide heterozygosity (Hedrick et al. 1986), including quantita- 
tive variability that may underlie morphological or physiological traits of 
potential adaptive significance (Pfrender et al. 2000). For example, in a 
recent literature review of microsatellite heterozygosity values, Coltman 
and Slate (2003) concluded that available estimates of variation at STR loci 
are only weakly correlated with phenotypic or life history variation, and 
that far larger sample sizes (typically more than 600 individuals per study) 
will have to be employed in the future to detect any statistically robust rela- 
tionships that might exist with inbreeding depression. Years earlier, Carson 
(1990) already had gone even further in suggesting that “genetic variance 
available to natural selection may actually increase following a single severe
bottleneck” and that “character change in adaptation and speciation may, in
some instances, be promoted by founder events.” These conclusions 
stemmed from observations and experiments with bottlenecked popula- 
tions of fruit flies and house flies. 

From these and additional considerations, the argument has been made 
that demographic, ecological, and behavioral considerations should often be 
of greater immediate importance than genetic (i.e., heterozygosity) issues in 
the formulation of conservation plans for endangered species (Lande 1988, 
1999). For example, individuals in many species show decreased reproduc- 
tion at low population densities because of a lack of the social interactions 
that are necessary for breeding, difficulties in finding a mate, or other den- 
sity-dependent behavioral and ecological factors collectively known as 
“Allee effects” (Andrewartha and Birch 1954). Furthermore, when popula- 
tions are few in number and small in size, the possibility of species extinc- 
tion through "stochastic" demographic fluctuations (irrespective of levels of 
molecular heterozygosity) can be of paramount immediate concern (Gilpin 
and Soulé 1986; Hanski and Gilpin 1997). 

On the other hand, some authors have forcefully argued that heterozy- 
gosity, as measured by molecular markers, is important to a population's 
health and continued survival and should be monitored accordingly in 
enlightened management programs (e.g., O'Brien et al. 1996; Vrijenhoek 
1996). In case studies involving a variety of taxa, plausible arguments have 
been advanced for rather direct associations between observed molecular 
variability and the viability of an endangered taxon. For example, in Isle 
Royale's population of gray wolves, Wayne et al. (1991b) speculated that an 
observed behavioral difficulty in adult pair bonding might be due to a recog- 
nition-triggered instinct for incest avoidance (because molecular data sug- 
gested that these wolves were highly inbred). For the isolated and genetically 
uniform Gir Forest population of Asiatic lions, O'Brien and Evermann (1988) 
concluded that high frequencies of abnormal spermatozoa and diminished 
testosterone levels in males, relative to lions of the African Serengeti, were 
attributable to intense inbreeding (because similar damaging effects on sperm 
development have been observed in inbred mice and livestock). In the 
Sonoran topminnow, Quattro and Vrijenhoek (1989) experimentally moni- 
tored several fitness components (survival, growth, early fecundity, and 
developmental stability) in laboratory-reared progeny of fish collected from 
nature. All of these fitness traits proved to be positively correlated with mean 
allozyme heterozygosities in the populations from which the parents origi- 
nated. In endangered plain pigeons (Columba inornata wetmorei) of Puerto 
Rico, four measures of reproductive fitness (total eggs, fertile eggs, number of 
hatchlings, and number of fledglings) were significantly correlated with 
genetic variation as measured in DNA fingerprints (Young et al. 1998). In the 
greater prairie chicken (Tympanuchus cupido), a wild population that had expe- 
rienced an extreme demographic contraction not only lost heterozygosity as 
measured by STR loci, but also suffered a significant decline in hatching rates 
(Bouzat et al. 1998). 


Perhaps the most intriguing case for a causal link between inbreeding,
low molecular heterozygosity, and diminished genetic fitness involves chee- 
tahs. As mentioned above, there is multifaceted molecular evidence for 
severely reduced heterozygosity in this species, including at MHC genes 
that encode cell surface antigens involved in the immune response. Several 
years ago, a disease (feline infectious peritonitis, or FIP, caused by a coron- 
avirus) swept through several captive cheetah colonies and caused 
50%-60% mortality over a 3-year period. This same virus in domestic cats 
(which have normal levels of MHC variation, as indicated by graft rejections 
and molecular assays) has an average mortality rate of only 1%. O'Brien and 
Evermann (1988) speculate that an FIP virus might have acclimated initially
to one cheetah, then spread rapidly to other individuals who were genetically
uniform in their immunological defenses. In general, enhanced susceptibility
to infectious diseases or parasitic agents probably constitutes one
of the most serious challenges faced by a population with low genetic
variation (O’Brien and Evermann 1988).

Interestingly, emphasis on the special adaptive significance of
immunorecognition genes led to a suggestion that captive breeding programs for
endangered species be designed with the specific goal of maintaining diversity
at MHC loci, because individuals heterozygous at all or most MHC loci
would be protected against a wider variety of pathogens than homozygous
specimens (Hughes 1991). Furthermore, according to Hughes (1991), “at most 
loci loss of diversity should not be a cause for concern, because the vast major- 
ity of genetic polymorphisms are selectively neutral.” This suggestion was 
immediately criticized on several grounds (Gilpin and Wills 1991; Miller and 
Hedrick 1991), not least of which was its assumption that variability at genes 
other than the MHC is adaptively irrelevant. The critics argued that several 
other loci are known to contribute to disease resistance itself, and that polygenes
underlying numerous other quantitative traits of potential adaptive relevance
should not be cavalierly ignored (Vrijenhoek and Leberg 1991). Furthermore,
selective breeding designed explicitly to maintain MHC diversity could have a
counterproductive consequence: accelerated inbreeding, with concomitant
accelerated loss of diversity elsewhere in the genome.

In conclusion, especially when large numbers of loci are monitored and multiple
assays are performed (e.g., of allozymes, MHC loci, and microsatellites),
molecular markers can provide quite reliable estimates of genome-wide
heterozygosity, which in turn are theoretically interpretable in terms of
historical effective population sizes (Box 9.3). Thus, molecular analyses can 
help to identify natural or captive populations that display severe genetic 
impoverishment from past population bottlenecks or inbreeding. Less clear 
(except perhaps in extreme cases, such as in cheetahs) is the extent to which
molecular heterozygosity is a reliable gauge of a population's short-term
survival and long-term adaptive potential. Thus, managing captive or natural
populations for genetic heterozygosity per se should not come at the
expense of neglecting important behavioral, ecological, or environmental
factors. Usually, however, all such considerations are mutually reinforcing. 
BOX 9.3 Effective Population Size in a Conservation
Context 





The concept of effective population size (see Box 2.2) is relevant to many
conservation efforts. For example, in the early literature of conservation genetics,
Frankel (1980) and Soulé (1980) suggested that a minimum effective population
size of 50 individuals would be required to stem inbreeding depression, and
Frankel (1980; Frankel and Soulé 1981) added that an effective population size
of 500 would prevent the long-term erosion of variability by genetic drift.
These specific management guidelines became known as the 50/500 rule, and
at first they were taken quite literally in some conservation practices
(Simberloff 1988). Later, Lande (1995) and Lynch and Lande (1998) concluded
from further theoretical considerations that these numbers were about one or
two orders of magnitude too small (see also Frankham et al. 2002). All such
recommendations are assumption-laden and provide only crude guidelines for
population management (Varvio et al. 1986), but they do illustrate the kind of
attention that has been devoted to estimating Ne in a conservation context.

As described in Chapter 2, average molecular heterozygosity (H) and
long-term effective population size (evolutionary Ne) are interrelated under
models that assume selective neutrality for genetic variation. If average
mutation rates for particular classes of molecular markers are known or assumed,
and if the current standing crop of variation in those markers has been
assayed in a given extant population, results can be translated into provisional
estimates of evolutionary Ne for that population by several statistical
procedures (Estoup and Angers 1998; Luikart and England 1999). In one application
of this approach involving a population of Tanzanian leopards (Panthera
pardus), Spong et al. (2000) assayed genetic variation at 17 microsatellite loci and
then converted the resulting expected heterozygosity values (assuming
Hardy-Weinberg equilibrium) to estimates of long-term effective population
size using the formula

\[ N_e = \frac{1/(1 - H_E)^2 - 1}{8\mu} \]

(see Lehmann et al. 1998). When a microsatellite mutation rate of μ = 2 × 10^-4
was assumed, these molecular data suggested that long-term Ne was about
40,000 for leopards in this region of Africa.
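
The conversion from heterozygosity to long-term effective size can be sketched in a few lines of code. The sketch below assumes the formula above has the form usually associated with a stepwise mutation model, Ne = [1/(1 - H)^2 - 1] / (8μ); the function names are hypothetical, and the heterozygosity used in the example is back-calculated from the Ne and μ quoted in the text rather than taken from the leopard data.

```python
import math


def ne_from_heterozygosity(h_exp: float, mu: float) -> float:
    """Long-term Ne from expected heterozygosity: (1/(1-H)^2 - 1) / (8*mu)."""
    if not 0.0 <= h_exp < 1.0:
        raise ValueError("expected heterozygosity must lie in [0, 1)")
    if mu <= 0.0:
        raise ValueError("mutation rate must be positive")
    return (1.0 / (1.0 - h_exp) ** 2 - 1.0) / (8.0 * mu)


def heterozygosity_from_ne(ne: float, mu: float) -> float:
    """Inverse relation: H = 1 - 1/sqrt(1 + 8*Ne*mu)."""
    return 1.0 - 1.0 / math.sqrt(1.0 + 8.0 * ne * mu)


# Illustration only: which H corresponds to Ne = 40,000 at mu = 2e-4?
h_implied = heterozygosity_from_ne(40_000, 2e-4)
ne_back = ne_from_heterozygosity(h_implied, 2e-4)
```

Because Ne scales inversely with the assumed mutation rate, a tenfold error in μ produces a tenfold error in the estimate, which is one reason such values are best regarded as provisional.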

Molecular data from extant populations can also be used to estimate
temporal historical demographies of populations. The general philosophy
underlying this approach was introduced in the discussions of lineage sorting and
coalescent theory in Chapters 2 and 6. One approach involves examining
mismatch frequency distributions (i.e., pairwise genealogical distances between
extant individuals as estimated from molecular markers): Different kinds of 
population histories are predicted to leave different types of footprints on these 
mismatch distributions. For example, a rapid population expansion in the past 
theoretically "makes a wave" (Rogers and Harpending 1992) in the distribution 
because many extant lineages will have traced back (coalesced) to that approxi- 
mate time of evolutionary expansion, thereby making a peak or wave in the 
mismatch histogram coinciding with that historical period. 
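
A mismatch distribution of the kind just described can be tabulated directly from aligned sequences. The following is a minimal sketch, assuming equal-length aligned haplotypes and using a raw count of differing sites as the pairwise distance; the function names and the toy sequences are hypothetical, and published analyses typically use model-corrected distances.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(seq_a: str, seq_b: str) -> int:
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mismatch_distribution(sequences: list[str]) -> dict[int, float]:
    """Relative frequency of each pairwise difference count over all pairs."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


# Toy haplotypes, invented for illustration.
toy = ["AAAA", "AAAT", "AATT", "AAAA"]
distribution = mismatch_distribution(toy)
```

A single pronounced peak in the resulting histogram is the "wave" expected after a past expansion, whereas ragged or multimodal distributions are usually read as signs of long-term stability or population structure.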

A related coalescent approach utilizes lineages-through-time plots analo- 
gous to the one presented in Figure 7.4, except that such lineages in the current 
context are those in intraspecific genealogies (derived from mtDNA or from
genetic distance analyses of multiple nuclear loci), rather than branch lengths in 
species phylogenies. If the number of intraspecific lineages has neither increased 
nor decreased disproportionately in a population's recent evolutionary history, a 
lineages-through-time plot should be roughly linear. However, the plot should 
be concave upward or concave downward, respectively, if the population has 
experienced recent accelerated or decelerated growth (Nee et al. 1996a). By 
applying this type of lineage-based reasoning to their genetic data, Lehmann et 
al. (1998) were led to conclude that the Tanzanian leopard population mentioned 
above had been large and roughly stable in size over recent evolutionary time. 

Such marker-based appraisals of the temporal dynamics of population 
demographic history now appear quite often in the literature (Lavery et al. 1996; 
Nee et al. 1996a,b; Rooney et al. 1999), but the results must be interpreted with 
great caution because they rest critically on several underlying assumptions: 
that the focal population has been genetically closed and spatially unstructured; 
that mutation rate estimates are reliable; that genealogies are estimated with 
considerable accuracy; and that the particular molecular markers employed are 
appropriate for the ecological or evolutionary time frame presumably covered 
by the analysis. Thus, marker-based estimates of historical population sizes and 
their temporal dynamics normally have major uncertainties and wide biological 
confidence limits (Hedrick 1999; Marjoram and Donnelly 1997). 

Molecular markers can also be employed to estimate generation-by-generation
Ne in modern or contemporary time. One such method requires that neutral
allele frequencies be monitored across two or more generations. Then,
effective population size for that time interval can be estimated by statistical
procedures that relate Ne to any of several molecular outcomes that might be
empirically observed: reductions in heterozygosity due to inbreeding; changes
in allele frequencies due to genetic drift; or rates of decay of linkage
disequilibrium among loci. Details of these and similar methods can be found in
Frankham et al. (2002), Neigel (1996), and Schwartz et al. (1998). Apart from
the conservation relevance of numerical abundance per se (because rarity is 
often an indicator of extinction vulnerability), there are other genetic reasons 
for interest in contemporary population size: advantageous mutations are far 
more likely to arise in large than in small populations; and natural selection's 
influence (relative to drift) is theoretically stronger in larger populations. 
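
As one illustration of the first of these outcomes, the idealized expectation that neutral heterozygosity decays by a factor of 1 - 1/(2Ne) per generation can be inverted to give a rough estimate of contemporary effective size. This is a minimal sketch under that textbook assumption (a closed, randomly mating population with no mutation or selection over the interval); it is not the estimator of any particular study cited above, and the function names are hypothetical.

```python
def expected_heterozygosity(h0: float, ne: float, generations: int) -> float:
    """Idealized decay: H_t = H_0 * (1 - 1/(2*Ne))**t."""
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


def ne_from_heterozygosity_loss(h0: float, ht: float, generations: int) -> float:
    """Solve the decay relation for Ne given heterozygosity at two time points."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    if not 0.0 < ht < h0:
        raise ValueError("a measurable loss is required: 0 < ht < h0")
    retained_per_generation = (ht / h0) ** (1.0 / generations)
    return 1.0 / (2.0 * (1.0 - retained_per_generation))


# Illustration: simulate the expected loss for Ne = 50 over 10 generations,
# then recover Ne from the two heterozygosity values.
h_start = 0.5
h_end = expected_heterozygosity(h_start, ne=50, generations=10)
ne_estimate = ne_from_heterozygosity_loss(h_start, h_end, generations=10)
```

In real data the heterozygosity values carry sampling error, so estimates from short intervals or small samples can be very imprecise.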
Tagging technique               Universal?   Heritable?   Permanent?
Molecular markers               Yes          Yes          Yes
Natural phenotypic markings     No           Yes or no    Yes or no
Human-applied physical tags     No           No           Yes or no

Source: Modified from Palsbøll 1999.








| Molecular Markers in Conservation Genetics 491 


For example, management programs for rare or endangered species often 
promote heterozygosity preservation de facto through captive breeding pro- 
grams designed to avoid close inbreeding or via habitat preservation that 
promotes larger effective population sizes in the wild. So, genetic heterozy- 
gosity issues are typically just one of several reinforcing elements in a nexus 
of management considerations for endangered taxa. 


Genealogy at the Microevolutionary Scale 


Although the heterozygosity and inbreeding issues discussed above have 
often dominated discussions in conservation genetics, molecular markers 
are perhaps even better suited for forensic, genealogical, and phylogenetic 
appraisals of organisms in various conservation contexts, at both micro- and 
macroevolutionary scales. This section begins to address various microevo- 
lutionary applications of molecular genetic techniques in conservation. 


Tracking individuals in wildlife management 


Wildlife movements traditionally are monitored through the use of physical 
devices attached by researchers (such as fin tags placed on fish, leg bands on 
birds, or radio collars on mammals) or by field observations of individuals 
that are distinguishable by natural phenotypic markings (such as variable 
color patterns or scars on whales). Protein and DNA molecules can provide 
specimen tags too, and they offer several advantages over traditional tracking 
methods (Table 9.2): all individuals in all species come ready-made with these 
natural labels; genetic tags are transmitted across generations under specifi- 
able modes of inheritance; and modern laboratory techniques (notably PCR) 
make it routinely possible to obtain genotypes non-destructively and non- 
invasively (from shed hair, feathers, eggshells, feces, etc.), sometimes without 
the need to handle or even see the animals analyzed (Morin and Woodruff 
1996; Taberlet and Luikart 1999). Another advantage is that population genet- 
ic variation in sexual species is normally so high that molecular markers from 
multiple hypervariable loci are expected to distinguish individuals with high 
probability (Paetkau and Strobeck 1994). 
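
One common way to make this statement quantitative is a probability-of-identity calculation: the chance that two unrelated individuals drawn at random share the same multi-locus genotype. The sketch below assumes Hardy-Weinberg proportions within loci and independence among loci; the function names and allele frequencies are hypothetical, and the statistic is not necessarily the exact one used in the studies cited.

```python
from itertools import combinations
from math import prod


def locus_identity_probability(allele_freqs: list[float]) -> float:
    """P(two random individuals share a genotype) = sum p_i^4 + sum (2 p_i p_j)^2."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygote_term = sum(p ** 4 for p in allele_freqs)
    heterozygote_term = sum(
        (2.0 * p * q) ** 2 for p, q in combinations(allele_freqs, 2)
    )
    return homozygote_term + heterozygote_term


def multilocus_identity_probability(loci: list[list[float]]) -> float:
    """Product of per-locus probabilities, assuming independent loci."""
    return prod(locus_identity_probability(freqs) for freqs in loci)


# Hypothetical panel: six loci, each with ten equally frequent alleles.
panel = [[0.1] * 10 for _ in range(6)]
overall = multilocus_identity_probability(panel)
```

For this hypothetical panel the per-locus value is about 0.019 and the six-locus value is roughly 5 × 10^-11. Close relatives share genotypes far more often than this calculation implies, so the figure is best read as a lower bound on the risk of a false match.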
In genetic studies of humpback whales (Megaptera novaeangliae),
Palsbøll et al. (1997b) "ground-truthed" the individual-diagnostic power of
molecular markers by comparing identifications from genotypes at six
polymorphic microsatellite loci with those from natural markings
(pigmentation patterns and scar marks) known to distinguish particular
specimens. More than 3,000 samples for molecular analysis came either from
skin biopsies or from sloughed skin collected from free-swimming whales.
The molecular findings not only confirmed the individual-diagnostic
power of these genetic markers, but also supported earlier assumptions
about the animals' annual migration routes and their patterns of site
fidelity to specific summer feeding grounds.

Comparable studies have been conducted on large land animals. In a 
study of mountain lions (Puma concolor) in Yosemite Valley, California, 
Ernest et al. (2000) "tracked" individual animals by analyzing multi-locus 
genotypes from fecal samples collected along hiking trails, and also com- 
pared these genotypes with those from animal tissues sampled at necrop- 
sy. One of the largest molecular studies of this sort involved black bears 
(Ursus americanus) and brown bears (U. arctos) in western Canada, for 
which Woods et al. (1999) used "hair traps" (barbed wire attached to trees)
to collect nearly 2,000 hair samples from the wild. Following PCR, these 
samples were genotyped at the mtDNA control region, at a Y chromosome 
segment, and at six autosomal microsatellite loci. Each genotypic match 
was deemed to be a repeat collection from the same individual, whereas 
distinct genotypes clearly came from different specimens. Through these 
genetic analyses, the authors were able to identify the species, sex, and 
individuality of each sample without the need to capture or even observe 
these free-ranging animals directly. Molecular studies on European bears 
have achieved similar success using PCR-amplified DNA from fecal sam- 
ples (Kohn et al. 1995; Taberlet et al. 1997). 

Apart from obviating the need to disturb (or be disturbed by) such large 
and difficult-to-observe animals, another rationale for this kind of individ- 
ual tracking by non-invasive molecular markers is to estimate current pop- 
ulation size. For example, based on microsatellite analyses of coyote (Canis 
latrans) fecal samples systematically collected within a 15-km² region of the
Santa Monica Mountains near Los Angeles, California, Kohn et al. (1999) 
estimated a population size of N = 38 for these otherwise difficult-to-count 
animals. 
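
The logic of counting animals from non-invasive genotypes can be illustrated with a simple mark-recapture calculation. The estimation procedure actually used by Kohn et al. (1999) is not described here, so the sketch below swaps in a different, textbook approach: Chapman's bias-corrected form of the two-session Lincoln-Petersen estimator. It assumes a closed population, equal detectability of individuals, error-free genotyping, and that each distinct multi-locus genotype is one animal; the function name and genotype labels are hypothetical.

```python
def chapman_estimate(session_one: list[str], session_two: list[str]) -> float:
    """Chapman estimator: (n1 + 1)(n2 + 1)/(m + 1) - 1, with m = recaptures."""
    first = set(session_one)    # distinct genotypes detected in session 1
    second = set(session_two)   # distinct genotypes detected in session 2
    if not first or not second:
        raise ValueError("both sessions need at least one genotype")
    recaptured = len(first & second)
    return (len(first) + 1) * (len(second) + 1) / (recaptured + 1) - 1


# Invented genotype labels; repeated samples from one animal collapse to one entry.
first_sweep = ["g01", "g02", "g03", "g04", "g02"]
second_sweep = ["g03", "g04", "g05"]
estimated_size = chapman_estimate(first_sweep, second_sweep)
```

Genotyping errors in low-quality fecal DNA tend to create spurious "new" individuals and thereby inflate such estimates, which is why replicate amplifications are usually recommended.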





Parentage and kinship 


Multi-locus genotypes obtained by non-invasive sampling can also be used in 
molecular studies of genetic paternity and kinship. For example, Morin et al. 
(1994a,b) used hair samples, and Gerloff et al. (1995) used fecal samples, as 
DNA sources for PCR-based assessments of genetic relatedness in wild
chimpanzees (Pan troglodytes). Such genetic appraisals often have conservation or
management relevance. For example, microsatellite paternity analyses in one
group of captive chimpanzees showed that the dominant male had sired most
(but not all) offspring (Houlden et al. 1997). If these results are typical, they
suggest that particular males might generally dominate reproduction in
captivity, in which case Ne in each zoo population could be far lower than might
otherwise have been assumed. By contrast, a similar marker-based study of
wild chimpanzees revealed that slightly more than 50% of analyzed offspring
had been fathered by males from outside the focal group (Gagneux et al.
1997). Thus, managers might wish to consider occasional exchanges of
male chimpanzees (or at least their sperm for artificial insemination) between
zoos, both to diminish inbreeding within captive colonies and perhaps to
mimic natural behavioral and genetic conditions more closely.

Longmire et al. (1992) used DNA fingerprints to assess paternity with- 
in a small captive flock of endangered whooping cranes (Grus americana) 
that had been maintained at the Patuxent Wildlife Center in Maryland since 
1965. Because of this species' long generation time and low reproductive 
output, crane husbandry in captivity requires substantial human invest- 
ment. Thus, in attempts to maximize the captive flock's efficiency in pro- 
ducing fertile eggs, Patuxent researchers sometimes artificially inseminat- 
ed adult females with semen from several males. However, this procedure 
also made paternity uncertain, with the undesirable consequence that 
breeding plans based on maximizing Ne (i.e., avoiding inbreeding) were
compromised. The molecular genetic study rectified this situation by pro- 
viding a posteriori knowledge of biological parentage that in turn could be 
used as pedigree information in the design of subsequent matings within 
the flock. A later analysis of microsatellite DNA extended this approach to 
additional captive and reintroduced wild populations of whooping cranes 
(Jones et al. 2002).

Molecular data on parentage and kinship have been used to verify breed- 
ing records and correct “studbook” errors for several captive or reintroduced 
endangered species, such as the Waldrapp ibis (Geronticus eremita; Signer et al. 
1994) and Arabian oryx (Oryx leucoryx; Marshall et al. 1999). Such analyses can 
also be used to help decide which specific individuals in a captive population 
of known pedigree should have breeding priority when the goal is to maxi- 
mize population genetic variation (Haig et al. 1990; Hedrick and Miller 1992). 
For this purpose, two explicit breeding guidelines have been suggested and 
sometimes implemented: crosses should be designed to equalize expected 
genetic contributions from the original founder individuals (Lacy 1989); and 
reproduction by individuals who are deemed to have special genetic signifi- 
cance by virtue of their outlying genealogical positions in the pedigree should 
be emphasized (Geyer et al. 1989). 

Marker-based studies related to parentage and kinship can often provide 
conservation-relevant information on managed non-captive populations as 
well. One example involves the Mauna Kea silversword (Argyroxiphium sand- 
wicense), which, due to grazing by alien ungulates, had been reduced by the 
late 1970s to a single remnant population on the island of Hawaii. To improve 
the species' prospects for survival, beginning in 1973, several hundred plants 
were intermittently introduced to other sites, but it turned out that all of these
outplants were probably first- or subsequent-generation offspring of only two 
maternal founders. An analysis of molecular markers from 90 RAPD loci indi- 
cated that detectable polymorphism had decreased by nearly 75% during this 
reintroduction program (Robichaux et al. 1997). 

An interesting application of genetic parentage analysis involving both 
captive and wild stocks has been conducted on steelhead trout, Oncorhynchus 
mykiss. In 1996, at Forks Creek, Washington, hatchery-raised fish were 
released into the wild to supplement natural stocks that were in decline. 
Later, through parentage analyses based on microsatellite markers, McLean 
et al. (2003) showed that hatchery-raised females had produced only about 
one-tenth as many surviving adult progeny per capita as had wild females. 
The poor performance of the hatchery-reared fish may have been an inad- 
vertent consequence of their domestication, or of their having spawned too 
early in the winter season, or both, but in any event, the results indicated the 
near-futility of this management scheme (at least at this site). 

Perhaps the greatest value of molecular parentage and kinship analyses 
in conservation efforts lies in their ability to reveal otherwise unknown 
aspects of the reproductive biology and natural history of threatened (and 
other) species in the wild. A case in point involves the rare blue duck 
(Hymenolaimus malacorhynchos), which inhabits isolated mountain streams in 
New Zealand. From patterns of band sharing in DNA fingerprints, new 
insights emerged about this species’ probable family-unit structure. 
Significantly higher genetic relatedness was documented within than 
between blue duck populations from different rivers, probably due to a social 
system that includes limited dispersal from natal sites and frequent matings 
among relatives (Triggs et al. 1992), as well as source-sink population dynam- 
ics (King et al. 2000). Another example involves mating patterns in marine 
turtles, about which little was formerly known because these animals mate at 
sea and individual specimens are difficult to distinguish visually. Molecular 
paternity analyses of several populations of these endangered species have 
revealed varied percentages (ca. 10%-60%) of clutches that were the result of
successful multiple mating by females (Bollmer et al. 1999; Crim et al. 2002; 
Fitzsimmons 1998; Harry and Briscoe 1988; Kichler et al. 1999; Parker et al. 
1996). Furthermore, maternity analyses based on mtDNA have revealed much 
new information about individuals’ patterns of yearly as well as lifetime 
migration between nesting sites and foraging grounds (see below). 

In one final example, the northern hairy-nosed wombat (Lasiorhinus 
krefftii) may be Australia's most endangered marsupial, but its fossorial and 
nocturnal habits make it difficult to observe in the wild. Genotypes at nine 
microsatellite loci were scored in more than two-thirds of the 85 known 
individuals of this species. The results unexpectedly showed that both 
males and females were significantly more closely related to their same-sex 
burrow companions than to random individuals from the population, 
whereas opposite-sex burrow companions were not closely related (Taylor 
et al. 1997). Results suggest that wombats may have dispersal mechanisms
that lead to associations of same-sex relatives, but that these mechanisms do
not lead to a high incidence of inbred matings because close relatives of 
opposite gender are not significantly associated in space. 


Gender identification 


Many endangered species in captive breeding programs are sexually non- 
dimorphic in their visible features, so sex-linked molecular markers can 
play a key role in identifying gender. In the critically endangered black stilt 
(Himantopus novaezelandiae) of New Zealand, knowledge gained through 
molecular sexing was used to avoid same-sex pairings in a recovery pro- 
gram that began with only about a dozen adult birds (Millar et al. 1997). 
Gender-specific molecular markers (notably on the female-specific W chro- 
mosome) likewise have been used to sex individuals in captive breeding 
programs for the critically endangered Taita thrush (Turdus helleri; Lens et al. 
1998), Norfolk Island boobook owl (Ninox novaeseelandiae undulata; Double
and Olsen 1997), and others. DNA from molted feathers was PCR-amplified 
and used to determine the sex of the last Spix’s macaw (Cyanopsitta spixii) in 
the wild (Griffiths and Tiwari 1995). The individual turned out to be a male, 
so a captive female was released as a prospective mate. By establishing the 
gender of otherwise unknown individuals in any population, molecular 
markers can also provide information suitable for estimating effective
population size, because the relative number of breeding adults of the two
genders is an important determinant of Ne (see Box 2.2).
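
The dependence of Ne on the breeding sex ratio follows the standard relation Ne = 4 Nm Nf / (Nm + Nf), where Nm and Nf are the numbers of breeding males and females. A minimal sketch of that calculation is given below; the function name and the example counts are hypothetical, and the relation ignores other influences on Ne such as variance in family size and fluctuating population size.

```python
def ne_unequal_sex_ratio(breeding_males: int, breeding_females: int) -> float:
    """Effective size with unequal numbers of breeders: 4*Nm*Nf / (Nm + Nf)."""
    if breeding_males <= 0 or breeding_females <= 0:
        raise ValueError("both sexes must contribute at least one breeder")
    return 4.0 * breeding_males * breeding_females / (breeding_males + breeding_females)


# Invented examples: a balanced group of 20 versus one male with 99 females.
balanced = ne_unequal_sex_ratio(10, 10)
skewed = ne_unequal_sex_ratio(1, 99)
```

With an even sex ratio, Ne equals the number of breeders, but with a single breeding male it can never exceed 4 regardless of how many females reproduce. Molecular sexing and paternity data together supply the counts that this calculation requires.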

An interesting application of molecular sexing to a wild endangered 
population involved the Seychelles warbler (Acrocephalus sechellensis; 
Komdeur et al. 1997). Each breeding pair of this non-dimorphic species 
occupies the same territory for as long as 9 years, producing one clutch per 
year with a single egg. A daughter (but seldom a son) often remains with the 
parents for 2 to 3 years, supposedly helping to raise later offspring. Indeed, 
field studies have shown that these helpers do increase the survival chances 
of nestlings in high-quality parental territories, but not in poor-quality ones, 
where the helpers do more harm than good (probably by competing for lim- 
ited resources). By molecular sexing of baby birds from many nests, 
researchers discovered that adult warblers on high-quality territories pro- 
duced mostly stay-home daughters, whereas adult warblers on poor-quali- 
ty territories produced mostly leave-home sons. The authors interpreted 
these genetic findings and field observations as evidence that parent 
Seychelles warblers must have some (unknown) means for adaptively 
adjusting the sex ratio of their progeny to environmental conditions. 


Estimating historical population size 


As mentioned in Chapter 2, most high-gene-flow species with large
numbers of living individuals have much shallower intraspecific histories (i.e.,
show far less molecular genetic variation) than might have been expected
given their current abundance, suggesting that such species probably
experienced demographic contractions (and/or selective sweeps) in times past.
Interestingly, molecular markers sometimes register an entirely different 
relationship between inferred historical population size and contemporary 
census size in species that today are rare or endangered. A case in point 
involves several cetacean species that in recent centuries have been the tar- 
gets of commercial whaling. 

Only about 10,000 humpback whales (Megaptera novaeangliae), 56,000 fin 
whales (Balaenoptera physalus), and 149,000 minke whales (B. acutorostrata) 
remain in the North Atlantic today. However, based on coalescent theory as 
applied to current observed standing crops of intraspecific mtDNA sequence 
diversity, Roman and Palumbi (2003) estimated that historical population 
sizes were much greater in these species: about 240,000 humpbacks, 360,000 
fin whales, and 265,000 minke whales. These estimates are far higher than tra- 
ditional ones (e.g., from whalers' logbooks), and they have generated consid- 
erable controversy (Lubick 2003). However, if these genetically based esti- 
mates of historical population size are valid even to a first approximation, sci- 
entists and whaling regulators may have to seriously revise their assumptions 
about the pre-exploitation state of large mammals in the oceans. 


Dispersal and gene flow 


Management strategies for natural populations can often benefit from knowl- 
edge about the movement patterns of organisms (and their gametes) from 
their natal sites and the magnitudes of realized gene flow (i.e., successful 
reproduction) that such dispersal entails. For example, an emerging discipline 
of "restoration genetics" (a synthesis of restoration ecology and population 
genetics) is concerned with issues such as the spatial extent of local adapta- 
tions, magnitudes of gene flow, and the potential risks of introducing foreign 
genotypes that might cause outbreeding depression or be unwise for other 
reasons (Hufford and Mazer 2003). Molecular markers can assist greatly in 
these analyses, and the findings often have conservation implications. 

One case in point involves marine protected areas, or MPAs (National 
Research Council 2000a). A sad truth is that negative human impacts, includ- 
ing pollution and overharvesting, have radically degraded the world's 
oceans. This situation has triggered calls for efforts to protect and restore 
marine ecosystems (Lubchenco et al. 2003; National Research Council 1999,
2000a,b). The good news is that the establishment of local or regional "no- 
take" zones can be highly effective not only in increasing biodiversity and 
the abundances of many species within the boundaries of such reserves, but 
also in supplementing commercial and sport fisheries when the larvae of 
high-fecundity species naturally move from MPAs into nearby waters 
(Halpern 2003; Halpern and Warner 2003; Palumbi 2001). How large should 
MPAs be, and how should they be spaced? To a considerable degree, the 
answers depend on each species' dispersal and recruitment patterns, and in 
particular, on magnitudes of larval production and transport. 







Using molecular markers applied to many marine taxa, scientists are 
examining genetic patterns in the sea and beginning to interpret results in 
the context of MPA design. For example, Palumbi (2003) used a combination 
of genetic findings and computer simulations to deduce that many marine 
fishes and invertebrate species probably have larval dispersal distances of 
about 25-150 km, and that genetic divergence under isolation-by-distance 
models will often be most evident when the demes compared are separated 
by about 2-5 times the mean larval dispersal range. These results are pre- 
liminary, but they do offer a hint at the appropriate spatial scales of demo- 
graphic connectivity among MPAs (which may be distributed, for example, 
as "stepping stones" along a coast). 


Population Structure and Phylogeography 


Issues of kinship, dispersal, and contemporary gene flow grade into those of 
geographic population structure and intraspecific phylogeny. At these levels 
too, many applications of molecular markers have been employed in a con- 
servation context. For example, especially in agronomically important plants, 
much effort has been devoted to collecting and storing "seed banks" from 
which future genetic withdrawals may prove invaluable in developing need- 
ed strains (Frankel and Hawkes 1975). Similar proposals have been made for 
generating DNA banks for endangered animal species (Ryder et al. 2000). The 
genetic diversity of such collections can be maximized through knowledge of 
how natural variation is partitioned within and among populations, a task for 
which molecular genetic markers are well suited (Schoen and Brown 1991). In 
general, by revealing how genetic variation is partitioned within any plant or 
animal species, molecular methods can help to characterize the intraspecific 
genetic resources that conservation biology seeks to preserve. 


Genetics-demography connections 


In some cases, molecular genetic findings on extended kinship have clear 
and immediate relevance for conservation strategies. For example, mtDNA 
analyses have shown that different rookeries of the endangered green turtle 
often are characterized by distinctive maternal lineages, indicating a strong 
propensity for natal homing by females of this highly migratory species (see 
Chapter 6). Irrespective of the level of inter-rookery gene flow mediated by 
males and the mating system, this matrilineal structure of rookeries indi- 
cates that green turtle colonies should be considered demographically inde- 
pendent of one another at the present time, because any colony that might 
be extirpated would be unlikely to be reestablished naturally by females 
hatched elsewhere, at least over ecological time frames (Figure 9.1). This 
genetics-based deduction is consistent with the observation that many rook- 
eries exterminated by humans over the past four centuries (including those 
on Grand Cayman, Bermuda, and Alto Velo) have not yet been recolonized. 
So, the continuing decline of many sea turtle colonies through overharvesting
is not likely to be ameliorated in the near term by natural recruitment
from foreign rookeries, meaning that each remaining colony should be
valued and protected individually (Bowen et al. 1992).

Figure 9.1 Alternative scenarios for the genetics and demography of green turtle
rookeries. Coded arrows indicate possible migration pathways of females between
natal and feeding grounds. (A) In the diagram above the heavy horizontal line,
females are assumed to home to natal sites, in which case rookeries would be
independent of one another both genetically (with respect to mtDNA) and
demographically (with respect to female reproduction). (B) In the diagram below the
horizontal line, females commonly move between rookeries, yielding genetic and
demographic non-independence. Molecular data from mtDNA are mostly consistent
with scenario (A). (After Meylan et al. 1990.)

This argument can be generalized and extended, as shown in Figure 9.2. 
When both sexes in any species disperse widely from their natal sites to 
reproduce (lower right quadrant of the figure), high gene flow and little pop- 
ulation genetic structure are anticipated in any class of neutral genetic mark- 
ers, perhaps implying also that local populations are demographically well- 
connected. Conversely, when both sexes are sedentary (upper left quadrant), 
low gene flow and strong population genetic structure should be registered 
in any suitable cytoplasmic or nuclear marker, implying considerable demo- 
graphic autonomy for each local population. When female dispersal is high 
and male dispersal low (upper right quadrant), a "Y-linked" gene may or
may not show pronounced population structure, depending on whether the
vagabond females carry zygotes or unfertilized eggs. If the dispersing
females carry haploid gametes which are then fertilized by local males,
strong differentiation in Y-linked allele frequencies might be anticipated, and
the populations also would tend to be demographically independent from
one another with regard to male reproduction. However, if dispersing
females carry zygotes or juveniles between populations (e.g., via pregnancy
or social interactions), they could maintain demographic ties (as well as
avenues for the exchange of Y chromosomes) between geographic locales to
which adult males are strictly fidelic.

Figure 9.2 Relationships between population genetic structure and
gender-specific dispersal and gene flow regimes. Population genetic structure is as might be
registered in particular classes of neutral genetic markers with contrasting modes
of inheritance. See the text for further explanation, especially for nuanced
interpretations of the upper right quadrant where asterisks appear. (After Avise 1995.)

The most intriguing connection between population demography and 
matrilineal structure occurs when female dispersal is extremely low and male 
dispersal high (lower left quadrant in Figure 9.2). Then, as already mentioned, 
populations could be independent demographically even in the absence of 
significant spatial structure as registered in nuclear genes. In that case, gene 
flow estimates based solely on nuclear loci could, if taken at face value, pro- 
vide a grossly misleading base for management decisions requiring a demo- 
graphic perspective, such as how many distinct management units, or stocks, 
exist (see below), how they might respond to harvest, or how habitat corridors 
between refugia might couple otherwise separated populations. 

On the other hand, there are certain conditions under which inferred patterns
of female dispersal and matrilineal gene flow (taken at face value)
could also be misleading in demography-based population management. 
For example, if strong geographic matrilineal structure resulted solely from 
density-dependent restrictions on female movement, populations that are 
overexploited nonetheless might recover quickly by foreign recruitment as 
density-based impediments to female dispersal are relaxed. This situation 
might apply with special force to high-fecundity species such as many 
marine fishes and invertebrates, in which immigration of even a few gravid 
females might quickly replenish a local population. Another misleading sit- 
uation could involve species that conform to a source-sink demographic 
model (Pulliam 1988), in which most geographic populations persist via continued
recruitment from reproductively favorable areas. Matrilines would
exhibit little spatial structure, suggesting at face value that local populations
could recolonize quickly, yet extirpation of critical source populations could 
doom an entire regional assemblage. 

There are additional reasons for caution in relying exclusively on mito- 
chondrial (or other) molecular markers as phylogeographic guides to conser- 
vation strategies: the molecular assays themselves may fail to reveal some 
true population genetic structure; empirical patterns might be attributable to 
evolutionary forces other than historical gene flow and genetic drift (such as 
habitat-specific natural selection); and, perhaps most importantly, historical 
gene flow might have been high enough to homogenize frequencies of genet- 
ic lineages across locales (Nm >> 1), yet nonetheless far too low (in terms of 
absolute numbers of exchanged individuals per generation) to imply popula- 
tion demographic unity in any sense relevant to management. Thus, in assess- 
ing whether populations exhibit sufficient demographic autonomy to qualify 
as quasi-independent entities in contemporary time, field observations and 
experiments on animal dispersal will remain necessary because they comple- 
ment and extend the kinds of information on historical gene flow that often 
come from molecular markers. 
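
The gap between genetic homogeneity and demographic unity can be made concrete with a small calculation. The sketch below assumes Wright's island model at equilibrium for a diploid nuclear locus, under which F_ST is approximately 1/(4Nm + 1). The local population size and the migrant numbers are hypothetical values chosen only for illustration, and the function names are not drawn from any cited study.

```python
def expected_fst(nm: float) -> float:
    """Equilibrium F_ST under Wright's island model (diploid nuclear locus).

    nm is the absolute number of immigrants per generation (N times m).
    """
    return 1.0 / (4.0 * nm + 1.0)


def migration_rate(nm: float, n: float) -> float:
    """Fraction of a local population of size n replaced by immigrants each generation."""
    return nm / n


if __name__ == "__main__":
    n = 100_000  # hypothetical local population size (an assumption)
    for nm in (0.1, 1.0, 10.0):
        fst = expected_fst(nm)
        m = migration_rate(nm, n)
        print(f"Nm = {nm:5.1f}   expected F_ST = {fst:.3f}   m = {m:.6f}")
```

Under these assumptions, ten immigrants per generation would hold the expected F_ST near 0.024, which most surveys would read as near-homogeneity, yet those immigrants would make up only 0.01% of a population of 100,000. Exchange at that level could not replenish a depleted stock on any timescale relevant to management.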


Inherited versus acquired markers 


In characterizing wildlife stocks—in fisheries management, for example—an 
important distinction is between heritable (genetic) and acquired features 
(Booke 1981; Ihssen et al. 1981). The latter include a variety of environmental- 
ly induced attributes often employed in population analysis, such as isotope 
ratios in tissues (Chamberlain et al. 2000; Hobson 1999; Marra et al. 1998), par- 
asite loads (Caine 1986; T. P. Quinn et al. 1987), or various phenotypic charac- 
ters, such as vertebral counts, that may be developmentally plastic and at least 
partially indicative of environmental exposures (Jockusch 1997). Acquired 
characters also include the physical tags, bands, or transmitters that 
researchers attach to animals to monitor their movements. 

Acquired markers unquestionably serve an important role in population 
analysis because they reveal where individuals have spent various portions of 
their lives. For example, a popular exercise in recent years has been to recon- 
struct where particular birds have migrated not only through banding stud- 
ies, but also by measuring the ratios of stable isotopes (notably those of car- 
bon or hydrogen) in their bodies (Kelly and Finch 1998). These isotopes tend 
to differ predictably with environmental factors such as latitude (Rubenstein 
et al. 2002). However, physical tags and other acquired characteristics are sel- 
dom transmitted across generations, so they do not necessarily illuminate the 
reproductive phenomena that can also be highly germane to population man- 
agement, nor can they reveal the principal sources of phylogeographic diver- 
sity within a species. By combining information from acquired characters 
with data from molecular genetic markers, more and different kinds of infor- 
mation about movement patterns are obtained than from either source of data 
considered alone (e.g., Clegg et al. 2003).

Figure 9.3 Four possible categories of relationship between magnitudes of
apparent population structure as registered by distributions of inherited and of
environmentally acquired markers. For categories (C) and (D), it is assumed that
the environmental markers were acquired during the oceanic and freshwater life 
stages, respectively (as indicated by the asterisks). 


In general, four possible relationships can be envisioned between the 
apparent population structures registered by genetic and by environmentally 
induced tags (Figure 9.3). These two classes of markers may reveal high and 
concordant structures, as, for example, when populations inhabit separate 
environments that induce different phenotypes in individuals and also pro- 
mote population genetic divergence. Such a concordant outcome might char- 
acterize a sedentary species, or perhaps one that is geographically strongly 
partitioned (such as many freshwater taxa) (Figure 9.3A). Alternatively, both 
genetic and acquired markers might reveal relative population homogeneity 
over broad areas, as in many vagile marine species (Figure 9.3B). 

However, two types of discordant outcomes are also possible. First, 
strong population structure might be evidenced by genetic markers despite 
an absence of population differences in acquired characteristics. Such could be 
the case in a natal-homing anadromous species assayed at the freshwater 
adult life stage, provided that the acquired characters were gained during the 
oceanic phase of the life cycle (Figure 9.3C). Conversely, significant geographic
structure in acquired characters might be evident despite an absence
of genetic differentiation, as, for example, in a randomly mating catadromous
species sampled during the freshwater portion of its life cycle (provided that 
the acquired characters were incorporated during the freshwater phase) 
(Figure 9.3D). In other words, a true population genetic pattern would prob- 
ably not be registered accurately by any non-genetic marker acquired at a 
locality or life stage other than where reproduction occurs. 

These distinctions between genetic and acquired markers can be impor- 
tant in population management and conservation applications (Swain and 
Foote 1999). In the case of anadromous salmon originating in different 
rivers, for example, diagnostic population markers that happened to coin- 
cide with political boundaries (state or federal) could be ideal for equitably 
apportioning oceanic catches to fishermen from the relevant jurisdictions. 
For these purposes, any marker (genetic or otherwise) would do. Thus, naturally
acquired stream-specific parasite loads, as well as dyes or tags artificially
applied to smolts (juvenile salmon in streams), would be perfectly
suitable for these management purposes. However, the “populations” thus 
identified would have little or no evolutionary genetic significance if, for 
example, sufficient gene flow between stream populations occurs via lapses 
in natal homing. The point is that both inherited and acquired markers can 
find gainful employment for various management objectives, but only 
genetic markers have direct connection to population genetic and evolu- 
tionary issues. 








Mixed-stock assessment 


In fisheries management in particular, much attention has been devoted to 
"stock assessment" (Ovenden 1990; Ryman and Utter 1987; Shaklee 1983; 
Utter 1991), which can be viewed as a practical application of the principles 
and procedures of population structure analysis. Molecular markers can 
materially help in identifying such stocks (Shaklee and Bentzen 1998). 
Indeed, computerized databases now exist that summarize such information 
for fishes (Imsiridou et al. 2003). Molecular markers can also be invaluable in 
assessing the genetic composition of mixed stocks as a prelude, for example, 
to setting harvesting quotas or otherwise managing finite fish resources. 

In many commercial and sport fisheries, amalgamations of native and 
introduced (hatchery-produced or otherwise transplanted) stocks are har- 
vested. In several instances, molecular markers that distinguish genetic 
exotics from natives have been employed to monitor the fate ("success") of 
these introductions and to assess whether hybridization and introgression 
with native fish have taken place. The outcomes of such appraisals have var- 
ied. Among brown trout (Salmo trutta) in the Conwy River of North Wales, 
an introduction of fry from anadromous populations into landlocked popu- 
lations resulted in considerable hybridization between these two salmonid 
forms, as gauged by allozyme markers (Hauser et al. 1991). Thus, a stocking 
program designed to bolster catches of trout in landlocked bodies of water 
had come at the risk of introgressive loss of unique genetic characteristics in
the native landlocked form. For this same species in Spain, however, hatchery
supplementation was a "failure": Protein electrophoretic analyses
showed that genetically marked fry introduced from hatcheries into indige- 
nous populations failed to reach sexual maturity and apparently did not 
contribute to the pool of catchable and reproductive fish (Moran et al. 1991). 
Similar molecular studies of the Japanese ayu (Plecoglossus altivelis) likewise 
demonstrated that introduced stocks contributed little to reproduction in a 
native river population of this fish species (Pastene et al. 1991). 

Commercial fisheries may also involve exploitation of mixed native 
stocks, an example being the capture of anadromous salmon at sea. As 
alluded to above, a long-sought goal in salmon management has been to 
discover diagnostic markers that distinguish anadromous fish originating 
from different river drainages or regions (Beacham et al. 2000, 2001, 2002,
2003; Bermingham et al. 1991; Potvin and Bernatchez 2001). Using such
markers, the proportionate contribution of each breeding population to a 
mixed oceanic fishery could be determined, and harvesting strategies or 
capture allocations might then be tailored to varying population sizes or 
other attributes of the respective reproductive stocks. However, seldom 
have native salmon from nearby rivers displayed fixed allele frequency dif- 
ferences that would make such stock assignments unambiguous within a 
management jurisdiction (Stewart et al. 2003; but see Baker et al. 2003). 
Rather, observed genetic differences often involve mere shifts (albeit statis- 
tically significant ones) in population allele frequencies at multiple poly- 
morphic loci. Thus, even when particular source populations cluster togeth- 
er in terms of composite genetic distances (Figure 9.4), alleles typically are 
shared, such that precise genetic contributions from different stocks can be 
estimated only in a probabilistic or statistical framework (e.g., Millar 1987; 
Pella and Milner 1987; Xu et al. 1994). 
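
To illustrate what estimation "in a probabilistic or statistical framework" can look like, the following is a minimal sketch of a mixed-stock estimator. It assumes that haplotype (or allele) frequencies in each source stock are known without error, and it maximizes the mixture likelihood with a simple expectation-maximization (EM) loop. EM is used here as one convenient way to find the maximum likelihood estimate and is not necessarily the algorithm of the cited papers. The function name and the example frequencies are hypothetical.

```python
def mixed_stock_em(baseline, mixture_counts, iterations=500):
    """Estimate stock contributions to a mixed sample by EM.

    baseline[s][h]    : frequency of haplotype h in source stock s (assumed known)
    mixture_counts[h] : number of individuals in the mixture carrying haplotype h
    Returns a list of estimated contribution proportions, one per stock.
    """
    n_stocks = len(baseline)
    theta = [1.0 / n_stocks] * n_stocks  # start from equal contributions
    for _ in range(iterations):
        new_theta = [0.0] * n_stocks
        for h, count in enumerate(mixture_counts):
            if count == 0:
                continue
            # Probability of observing haplotype h in the mixture under current theta
            denom = sum(theta[s] * baseline[s][h] for s in range(n_stocks))
            if denom == 0.0:
                continue  # haplotype absent from every baseline stock
            for s in range(n_stocks):
                # Expected number of these individuals that came from stock s
                new_theta[s] += count * theta[s] * baseline[s][h] / denom
        assigned = sum(new_theta)
        if assigned == 0.0:
            raise ValueError("no mixture haplotype occurs in any baseline stock")
        theta = [x / assigned for x in new_theta]
    return theta


if __name__ == "__main__":
    # Two hypothetical stocks that share all three haplotypes at different frequencies
    baseline = [
        [0.7, 0.2, 0.1],
        [0.2, 0.3, 0.5],
    ]
    mixture_counts = [35, 27, 38]  # hypothetical sample of 100 fish
    print(mixed_stock_em(baseline, mixture_counts))
```

Because no haplotype is diagnostic in this example, no individual fish can be assigned to a stock with certainty; only the proportions are estimable. A fuller treatment would also propagate sampling error in the baseline frequencies, which this sketch ignores.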

One example of such a statistical mixed-stock analysis in a conservation 
context involved the endangered loggerhead turtle, Caretta caretta. In the
Mediterranean Sea, a longline fishery for swordfish incidentally captures an 
estimated 20,000 juvenile loggerheads per year, of which at least 20% perish 
(see Bowen and Avise 1996). To assess which breeding populations are affected
by this source of mortality, mtDNA cytochrome b sequences in longline captures
were compared with those found in North Atlantic and Mediterranean
loggerhead rookeries (Bowen 1995; Laurent et al. 1993, 1998). Based on a max- 
imum likelihood estimate from a mixed-stock model, only about 50% of log- 
gerheads in the longline bycatch were derived from Mediterranean nesting 
beaches; most of the rest had come from the western Atlantic. 

Molecular markers have likewise been employed to track migratory 
movements of marine turtles in other contexts (Avise and Bowen 1994). 
Based on mtDNA assays, juvenile loggerheads taken near Charleston, South 
Carolina, were shown to have been hatched at various rookery sites as far 
south as Florida (Sears et al. 1995). Loggerheads collected in the Azores and 
Madeira (near Africa) proved to have originated from rookeries in the southern
United States and Mexico (Bolten et al. 1998); and young specimens from
Figure 9.4 Genetic relationships among populations of pink salmon
(Oncorhynchus gorbuscha) from 26 streams in Washington and British Columbia. 
Shown is a phenogram based on allele frequencies at more than 50 allozyme loci. 
Reasonably strong genetic clustering is evident for five geographic regions 
(although the Nooksack River, indicated by an asterisk, enters Puget Sound). 
However, this clustering was due primarily to differing frequencies of shared alle- 
les rather than fixed allelic differences. (After Shaklee et al. 1991.) 


Baja California proved to have migrated across the Pacific from Japanese 
rookeries (Bowen et al. 1995). For green turtles (Chelonia mydas) as well, 
mixed-stock analyses of mtDNA data have been used to quantify the relative 
contributions of various North Atlantic rookeries to two of that species' 
major feeding grounds: off the northeastern coast of Nicaragua (Bass et al.
1998) and in the Bahamas (Figure 9.5).

Figure 9.5 Genetic contributions of different green turtle rookeries (lettered circles)
to a feeding-ground population (starred circle) on Great Inagua, Bahamas.
Numbers indicate maximum likelihood estimates of percent contributions of differ- 
ent rookeries to this feeding population, based on a mixed-stock model using 
observed mtDNA haplotypes. (After Lahanas et al. 1998.) 


Shallow versus deep population structures 


A general point about the recognition of population units in conservation 
biology involves the distinction between shallow (recent or contemporary) 
and deep (ancient) population genetic structure. In other words, "statistical- 
ly significant" population genetic structures are not all equal, but instead 
may be reflective of widely varying depths of evolutionary separation. 
Furthermore, populations of many species may be strongly isolated from 
one another at the present time (and at most time horizons in the past), but 
nonetheless remain tightly connected in a genealogical sense through recent 
or pulsed episodes of gene flow. Such considerations have given rise to a 
key distinction in conservation genetics between "management units" 
(MUs) and "evolutionarily significant units" (ESUs).

The conceptual distinction between MUs and ESUs (Box 9.4) is nicely 
illustrated by population genetic findings on the green sea turtle. Nesting 
rookeries of this endangered species display evident but "shallow" mtDNA 
population structure within a given ocean basin, due in large part to natal
BOX 9.4 MUs and ESUs





In many, if not most, species, organismal dispersal is far too low to promote 
appreciable demographic connections between geographic populations on a gen- 
eration-by-generation basis. Limited dispersal may be due to inherent low vagili- 
ty or to physical or ecological impediments to individual movements. With 
regard to population genetic evidence, the logic underlying the concept of a 
potential "management unit" (MU) is as follows: Any population that exchanges 
so few migrants with others as to be genetically distinct from them will normally 
also be independent of them demographically, at least at the present time. In the 
literature of commercial fisheries, MUs are often referred to as "stocks," toward 
which harvesting quotas or other management plans are directed (Avise 1987; 
Ovenden 1990; Ryman and Utter 1987). In principle, for any species, populations 
that are genuinely autonomous in contemporary time could qualify as distinct
MUs (regardless of whether or not genetic markers may have helped to document
such demographic autonomy).

When (as is often the case) empirical field evidence on the magnitude of 
demographic connection between natural populations is limited or absent, MUs 
can nonetheless be identified provisionally by significant differences in allele fre- 
quencies at neutral marker loci. Mitochondrial haplotypes are especially power- 
ful for identifying potential MUs (Avise 1995; Moritz 1994) because of their typi- 
cally fourfold smaller effective population size (compared with haplotypes at 
autosomal loci), and because of the special relevance of matrilines to population 
demography. 

An "evolutionarily significant unit" (ESU), in principle, is one or a set of
conspecific populations with a distinct long-term evolutionary history mostly 


homing by females (see Chapter 6). Thus, despite the recency of the genetic 
separations, most rookeries are demographically autonomous with regard 
to reproduction and, accordingly, each should qualify as an MU. The mag- 
nitude of genetic divergence between rookeries in the Atlantic versus Pacific 
Ocean basins is far greater (see Figure 6.5), apparently due to much longer- 
term barriers to gene flow. Thus, these regional assemblages of rookeries 
should qualify as distinct ESUs. Similar genetic architectures have been 
found in global molecular surveys of several other (but not all; Dutton et al. 
1999) species of marine turtles (Bowen and Avise 1996), and these findings 
are likewise highly informative in identifying provisional MUs and ESUs. 
Thus, both demographic and evolutionary separations are relevant to 
conservation efforts and management strategies (Avise 1987; Dizon et al. 
1992). Even shallow population genetic separations can be important 
because they indicate, for example, that magnitudes of contemporary recruit- 
ment from outside sources are probably insufficient to enable overexploited 
populations to recover quickly via natural immigration. Deep genetic separations
within a species are also significant because they register major
separate from that of other such units (Ryder 1986). As such, ESUs are the primary
sources of historical (and perhaps adaptational; Fraser and Bernatchez
2001) genetic diversity within a species and are thereby worthy of special consid- 
eration in conservation efforts (Avise 2000a; Bernatchez 1995; Moritz 1994). In 
practice, ESUs often conform closely to “intraspecific phylogroups,” as defined 
by Avise and Walker (1999). They might also be appropriately interpreted as vali- 
dated "subspecies" (the practical reason for not doing so being the likelihood of 
introducing taxonomic confusion: Most currently recognized subspecies were 
described from limited information in the pre-molecular era, and might or might
not qualify as bona fide ESUs upon more detailed analysis). 

Operational genetic criteria for recognizing ESUs have ranged from broad 
to detailed. A general suggestion is that ESUs contribute substantially to the 
overall genetic diversity within a species (Waples 1991b). A more explicit recommendation
is that ESUs be identified as groups of populations "reciprocally
monophyletic for mtDNA alleles and also differ[ing] significantly for the frequency
of alleles at nuclear loci" (Moritz 1994). Any such empirical suggestion
is arbitrary to some extent because there can be no clean line of demarcation
along the continua of possible magnitudes of population genetic differentiation 
or temporal depths of population separation. Application of genealogical con- 
cordance principles (see Chapter 6) offers perhaps the best hope for critically 
evaluating evolutionary depths of population separation using molecular 
genetic markers. Notwithstanding the empirical challenges of genetically iden- 
tifying potential MUs and ESUs in particular instances (e.g., Bowen 1998; 
Dimmick et al. 1999; Paetkau 1999), the concepts themselves remain among the 
most important, if not revisionary, perspectives to have emerged from phylo- 
geographic appraisals of microevolution. 


sources of evolutionary genetic diversity that can be especially important to 
conservation's broader goal of biodiversity preservation. What follows are a
few more examples of how both shallow and deep population structures 
within a species can inform conservation plans for rare or endangered taxa. 
Africa's highly endangered black rhinoceros (Diceros bicornis) and white 
rhinoceros (Ceratotherium simum) have been decimated by poachers who 
supply lucrative markets in ornamental dagger handles and supposed med- 
icines made from rhino horns. Even if poaching were eliminated, genetic 
and demographic problems in the few remaining populations could also 
threaten these species’ survival. Should all remaining conspecific rhinos be 
considered members of a single population for conservation purposes? Or 
should various recognized forms (including different taxonomic subspecies) 
be managed as separate population entities? The strategy of mixing and 
breeding conspecific rhinos from separate geographic sources might 
enhance each species' chance of survival by increasing effective population 
sizes and thereby forestalling inbreeding depression (and also reducing the 
risk of stochastic demographic extinctions). On the other hand, attempted
amalgamations of well-differentiated genetic forms might lead to outbreeding
depression or cause an overall erosion of genetic variation, most of
which might occur among (rather than within) geographic regions. 

To address such issues, several molecular studies have been conducted. 
Merenlender et al. (1989) reported only a small genetic distance (D = 0.005 
in allozyme comparisons) between two white rhino subspecies. Likewise, 
Swart and Ferguson (1997) found only very limited allozyme divergence 
between black rhinoceros subspecies, and Ashley et al. (1990) and O’Ryan et 
al. (1994) similarly found low overall mtDNA sequence divergence (p < 
0.5%). These genetic results suggested that outbreeding depression in cross- 
es between black rhino subspecies would probably not be a serious problem. 
A subsequent survey of the rapidly evolving mtDNA control region in two 
black rhino subspecies resulted in an estimate of about 2.6% sequence 
divergence, a value that the authors interpreted to indicate a separation time 
of perhaps a million years (Brown and Houlden 2000). Nonetheless, this 
sequence divergence in the mtDNA control region remains far lower than 
the directly comparable estimate of 14% sequence divergence between black 
rhinoceros and white rhinoceros. 

In a complex of endangered desert pupfishes (genus Cyprinodon) in and 
near Death Valley, California, early allozyme surveys revealed little poly- 
morphism within, but considerable genetic divergence among, most popu- 
lations, which currently are confined to isolated springs and streams that are 
remnants of inland lakes and interconnected watercourses that existed in 
former pluvial times (Turner 1974). A different pattern of geographic struc- 
ture within one of the desert pupfish species may be an exception that 
proves the rule. In C. macularius populations of the Salton Sea area of southern
California, polymorphism within remnant colonies accounted for 70%
of the total genetic variance, with differences among colonies contributing 
only 30% (Echelle et al. 1987; Turner 1983). The hydrologic history of this 
region suggests an explanation: These populations probably were in repeat- 
ed contact due to historical cycles of flooding of the lake basin, perhaps most 
recently in the early twentieth century when water broke out of the 
Colorado River irrigation system (Turner 1983). 
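
The within-colony versus among-colony partition reported above can be illustrated with a short calculation. The sketch below uses Nei's gene diversity statistics for a single locus (H_S within populations, H_T in the pooled total, and G_ST = 1 - H_S/H_T) as one common way of making such a partition. It weights all populations equally and applies no sample-size corrections, and it is not necessarily the estimator used in the pupfish studies. The allele frequencies are hypothetical.

```python
def heterozygosity(freqs):
    """Expected heterozygosity (gene diversity) at one locus: 1 - sum of p_i squared."""
    return 1.0 - sum(p * p for p in freqs)


def partition_diversity(pop_freqs):
    """Split total gene diversity into within- and among-population fractions.

    pop_freqs: one list of allele frequencies per population, alleles in the
               same order in every list (populations weighted equally).
    Returns (within_fraction, among_fraction); among_fraction is G_ST.
    """
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Mean within-population diversity, H_S
    h_s = sum(heterozygosity(f) for f in pop_freqs) / n_pops
    # Diversity of the pooled (average) allele frequencies, H_T
    mean_freqs = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    h_t = heterozygosity(mean_freqs)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the total sample")
    within = h_s / h_t
    return within, 1.0 - within


if __name__ == "__main__":
    # Two hypothetical colonies with strongly contrasting allele frequencies
    isolated = [[0.9, 0.1], [0.1, 0.9]]
    # Two hypothetical colonies with similar allele frequencies
    connected = [[0.6, 0.4], [0.4, 0.6]]
    print(partition_diversity(isolated))
    print(partition_diversity(connected))
```

In the first hypothetical case most diversity lies among colonies, as in the isolated Death Valley populations; in the second nearly all of it lies within colonies, as would be expected where colonies have been in repeated contact.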

In general, fishes in desert basins of North America are declining at an 
alarming rate, with more than 20 taxa having gone extinct in the last few 
decades and many more at risk of the same fate. Meffe and Vrijenhoek (1988) 
considered management recommendations that might stem from the two dif- 
ferent types of population structure exhibited by the remaining species, and 
these scenarios should apply to other biological settings as well (Vrijenhoek 
1996). In the "Death Valley model" (patterned after the desert pupfishes), 
populations are small and isolated, such that most of the total genetic diver- 
sity within a species is likely to be partitioned among sites. In managing such 
populations, there is no need for concern about human-mediated interrup- 
tions of gene flow because natural genetic contact has been absent since the 
time when the ancestral watercourses desiccated. Indeed, precautions should 
be taken to avoid artificial gene flow among healthy populations that are naturally
isolated and are probably in the process of evolutionary diversification.
The main concern should be maintenance of high Ne within locales to alleviate
both genetic and demographic dangers of small population size.

Under an alternative "stream hierarchy model" (patterned after the 
endangered Poeciliopsis occidentalis and certain members of the Cyprinodon 
complex), populations in small dendritic water systems exhibit varying
degrees of connection and gene flow, such that a larger fraction of total genet- 
ic diversity occurs within colonies. Meffe and Vrijenhoek (1988) suggested 
that management programs in such situations should be aimed at preserving 
the genetic integrity of each species while at the same time maintaining
genetic variability. Thus, moving conspecific individuals among sites within 
a river basin should pose no special difficulties because such movement 
probably occurs naturally. However, precautions should be taken to avoid 
artificial movements between separate drainages, especially when this 
involves mixing different species that might compete or hybridize. 

Considerable attention has also been devoted to assessing molecular 
genetic variation within and among populations of endangered "big-river"
desert fishes in the American Southwest (Dowling et al. 1996b; Garrigan et al. 
2002), and then devising management recommendations accordingly 
(Dowling et al. 1996b; Hedrick et al. 2000). In a bold and proactive conserva- 
tion plan based on genetic and demographic considerations for several 
species that inhabit the main stem of the Colorado River, Minckley et al. (2003) 
proposed that these native fishes should be bred and their progeny allowed 
to grow in isolated, protected, off-channel habitats from which competitor 
non-native fishes (otherwise abundant following human-mediated introduc- 
tions) are excluded. Panmictic adult populations (the source of brood stock for 
each native species) would continue to reside in the main channel of the river, 
but these would be supplemented each generation with individuals taken 
from the isolated off-channel habitats. With respect to genetics and demogra- 
phy, the intent of this program would be to significantly increase effective 
population sizes in these critically endangered species while otherwise being
faithful to their natural biology. 

Molecular analyses of population genetic structure have likewise been 
conducted on numerous endangered plants (see reviews in Hamrick and Godt 
1996; Rieseberg and Swenson 1996). To pick just one example, the meadow- 
foam Limnanthes floccosa californica is a geographically restricted annual that is 
endemic to vernal pools in Butte County, California. Protein electrophoretic 
analyses revealed that nearly all of the total genetic diversity (> 95%) was distributed
among rather than within the 11 known populations, such that estimates
of inter-population genetic exchange were very low (Nm ≈ 0.02; Dole
and Sun 1992). Based on these findings, the authors recommended a conservation
plan that emphasizes preservation of as many of these MUs as possible.

Conservationists' ultimate aim is to preserve biodiversity, an important 
currency of which is genetic diversity at a variety of levels (Ehrlich and 
Wilson 1991). A widely held sentiment is that added conservation value or 
worth should attach to groups of organisms that are genealogically highly
(rather than minimally) distinct from other such groups (Barrowclough
1992; Faith 1994; Vane-Wright et al. 1991). At the intraspecific level, this 
means that ESUs (often identified now by molecular markers) should also 
be assigned special conservation importance. 

Consider the great apes, which have been the subject of several molec- 
ular surveys. In chimpanzees (Pan troglodytes), two highly divergent 
mtDNA lineages proved to distinguish a geographically disjunct population 
in western Africa from populations to the east (Goldberg and Ruvolo 1997; 
Morin et al. 1992, 1994a). Similarly, in gorillas (Gorilla gorilla), relatively deep 
genetic subdivisions between eastern and western forms in Africa have been 
reported (Garner and Ryder 1996; Ruvolo et al. 1994; Saltonstall et al. 1998). 
In orangutans (Pongo pygmaeus), marked genetic divergence between popu- 
lations in Sumatra and Borneo has led to suggestions that two taxonomic 
species be recognized (Warren et al. 2001; Xu and Arnason 1996; Zhi et al. 
1996). Table 9.3 describes similar examples of other rare or endangered taxa 
in which deep phylogeographic splits have been documented (typically in 
mtDNA sequences). 

Rather ancient genealogical subdivisions sometimes appear even in the 
most unlikely of situations. The African elephant (Loxodonta africana) is one 
of the most conspicuous and supposedly well known species on Earth, but 
recent DNA sequence analyses of four independent nuclear genes (Roca et 
al. 2001) uncovered a deep genealogical partition between forest-dwelling 
and savanna-dwelling forms (Figure 9.6). This general pattern was support- 
ed and extended by findings of salient genetic distinctions between these 
and other geographic populations of African elephants at microsatellite loci 
and in mtDNA sequences (Comstock et al. 2002; Eggert et al. 2002). These 
molecular data, interpreted together with morphological, ecological, and 
reproductive evidence, led to a recent taxonomic revision in which two 
African species are now recognized (L. africana and L. cyclotis). At least two 
major mtDNA clades also exist in the Asian elephant (Elephas maximus), but 
these currently are recognized as conspecific ESUs rather than species 
(Fernando et al. 2003b; Fleischer et al. 2001). These examples also illustrate 
how molecular studies of "intraspecific" phylogeography can grade into 
issues of species-level taxonomy relevant to conservation (see below). 


Lessons from intraspecific phylogeography 


Ehrlich (1992) lamented that time is running out for saving biological diversity:
"The sort of intensive, species-focused research that I and my colleagues
have carried out [on checkerspot butterflies] ... appears to have a
very limited future in conservation biology. Instead, if a substantial portion 
of remaining biodiversity is to be conserved, detailed studies of single 
species must be replaced with 'quick and dirty' methods of evaluating entire 
ecosystems, designing reserves to protect them, and determining whether 
those reserves are working." Willers (1992) went further: "To dwell endlessly
on the tasks of obtaining more and ever more data for the expressed purpose
of managing a biological reserve is to suggest that enough knowledge is just
around the corner. This is not so." In a sense, these authors are correct.
The severe conservation challenges facing society cannot be solved solely by
detailed genetic and ecological studies of particular taxa or regions. A lack
of political and social will to implement existing scientific understanding
is a far greater impediment to conservation efforts than is a lack of
detailed scientific information.

Figure 9.6 Phylogenetic relationships of approximately 120 elephants sampled
from multiple geographic locations and habitat types. (Relationships are based on
nuclear DNA sequences.) (After Roca et al. 2001.)

TABLE 9.3 Examples of rare or endangered species in which molecular markers
have identified relatively deep phylogeographic separations between
at least some populations that conventionally have been considered
conspecific

Threatened species                                Major phylogeographic distinction                           Reference

Lasmigona sp. (bivalve mollusk)                   Northern versus southern populations in eastern U.S.        King et al. 1999
Cicindela dorsalis (tiger beetle)                 Atlantic Coast versus Gulf of Mexico populations            Vogler and DeSalle 1993, 1994
Latimeria chalumnae (coelacanth)                  Indonesian versus African ocean waters                      Holder et al. 1999
Litoria pearsoniana (hylid frog)                  Opposite sides of Brisbane River Valley, Australia          McGuigan et al. 1998
Lachesis muta (bushmaster snake)                  Central versus South America                                Zamudio and Greene 1997
Gnypetoscincus queenslandiae (rainforest skink)   Northern versus southern populations in Queensland          Cunningham and Moritz 1998
Gopherus polyphemus (gopher tortoise)             Eastern versus western regions of southeastern U.S.         Osentoski and Lamb 1995
Xerobates agassizi (desert tortoise)              East versus west of the Colorado River in western U.S.      Lamb et al. 1989
Apteryx australis (kiwi bird)                     North Island versus South Island, New Zealand               Baker et al. 1995
Strix occidentalis (spotted owl)                  Northern versus southern populations in western U.S.        Barrowclough et al. 1999; Haig et al. 2001
Lanius ludovicianus (loggerhead shrike)           San Clemente Island versus southern California mainland     Mundy et al. 1997
Burramys parvus (pygmy possum)                    Northern, central, and southern Australian Alps             Osborne et al. 2000
Dasyurus maculatus (marsupial tiger quoll)        Tasmania versus mainland Australia                          Firestone et al. 1999
Perameles gunnii (barred bandicoot)               Mainland Australia versus Tasmania                          Robinson 1995
Petrogale xanthopus (rock wallaby)                Southern Australia versus Queensland                        Pope et al. 1996
Cervus eldi (Eld's deer)                          Three distinctive and disjunct units in south Asia          Balakrishnan et al. 2003
Lycaon pictus (wild dog)                          Southern versus eastern Africa                              Girman et al. 1993; Roy et al. 1994
Eumetopias jubatus (Steller sea lion)             Gulf of Alaska northward versus southeast Alaska, Oregon    Bickham et al. 1996
Trichechus manatus (West Indian manatee)          Three or four major phylogeographic units in species range  Garcia-Rodriguez et al. 1998

On the other hand, there are important lessons to be gained from 
detailed phylogeographic studies of the sort described in this book. Most 
importantly, it has become abundantly clear that most species should not be 
viewed as undifferentiated monotypic entities for conservation (or other) 
purposes. Instead, they typically consist of geographically variable popula- 
tions with hierarchical and sometimes deep genealogical structures. From 
this recognition have come two rather unorthodox but fairly generalizable 
management guidelines for conserving intraspecific phylogeographic 
diversity. 


LIMIT UNNECESSARY TRANSPLANTATIONS. Most biologists recognize that 
introductions of exotic species can cause irreparable harm to regional biodi- 
versity by forcing extinctions of native species, but they have been slower to 
appreciate problems that can arise from transplanting and mixing well-dif- 
ferentiated genetic stocks within a species. Indeed, fish and game manage- 
ment agencies often sponsor active transplantations of organisms from one 
geographic region to another for purposes such as bolstering local popula- 
tion sizes, introducing "desirable" genetic traits into an area, or increasing 
local genetic heterozygosity. Unfortunately, undesirable consequences may 
also stem from such transplantations, including the possibility of disease or 
parasite spread, the irretrievable loss of the rich historical genetic records of 
populations, and the inevitable erosion of overall genetic diversity within a 
species (much of which was generated and maintained through historical 
geographic separations). Some transplantation programs may be justifiable, 
for example, when reintroducing a native species to a former range from 
which it has been extirpated by human activities. However, a developing 
perspective in conservation biology is that the burden of proof in any pro- 
posed transplantation program normally should rest on advocates of this 
strategy, rather than on those who would question the desirability of trans- 
plantations on the grounds cited above. 


DESIGN REGIONAL RESERVES. Comparative molecular analyses of region- 
al biotas have demonstrated that specific geographic areas (such as isolat- 
ed refugia of the Late Tertiary or Quaternary) have been evolutionary 
wellsprings for phylogeographic diversity within and among closely relat- 
ed taxa. In many cases, these phylogeographic sources of molecular biodi- 
versity were also recognized in traditional biogeographic appraisals of 
species' distributions (a conventional basis for defining biogeographic
provinces). These types of findings can motivate attempts to ensure the
safety and integrity of regional biodiversity hotspots. Such conservation
efforts might include implementing strict guidelines to discourage
transplantations between phylogeographic provinces. Ideally, they might even
entail designation of federally protected reserves, perhaps analogous to
National Parks (except that these "phylogeographic parks" would be focused
on preserving biodiversity, rather than on preserving special geological
features of the landscape). Although no such regional perspective
on phylogeographic diversity can hope to capture the idiosyncratic popu- 
lation structures and genetic subdivisions of each and every species, it can 
provide useful broad guidelines for management strategies, particularly 
as natural environments come under increased pressure and decisions on 
conservation prioritization become inevitable. 


Issues At and Beyond the Species Level 


Speciation and conservation biology 


Just as issues of close kinship grade into those of population structure and 
intraspecific phylogeny, so too do the latter grade into issues regarding the 
speciation process and associated taxonomic judgments. In the field of con- 
servation biology, these taxonomic issues often come into especially sharp 
focus in discussions of "endangered species" (Box 9.5). 


RECOGNITION OF ENDANGERED SPECIES. Most taxonomic assignments in 
use today (including formal designations of species and subspecies) were 
first proposed early in the twentieth century or before, often based on lim- 
ited phenotypic information and preliminary assessments of geographic 
variation. How adequately do these traditional taxonomies summarize true 
genetic biodiversity? For most groups, this question remains to be 
answered through continued systematic reappraisals, for which various 
molecular markers are ideally suited. The problems and challenges 
involved are far more important than a mere concern with nomenclature 
might at first imply. Taxonomic assignments inevitably shape our most fun- 
damental perceptions about how the biological world is organized. 
Ineluctably, formal names summarize the biotic units that are perceived 
and, therefore, discussed. In a conservation context, these perceived enti- 
ties provide the pool of candidate taxa from which are chosen particular 
populations toward which special management efforts may be directed 
(Soltis and Gitzendanner 1998). 

A "dusky seaside sparrow" by any other name is just as melanistic in its 
plumage, but without a taxonomic moniker this local population, formerly 
endemic to Brevard County, Florida, would probably not have been recog- 
nized as a biological unit worthy of special conservation attention. The 
dusky seaside sparrow was described in the late 1800s as a species
(Ammodramus nigrescens) distinct from other seaside sparrows (A. maritimus),
which were common along the North American Atlantic and Gulf coasts.
Although the dusky was later demoted to formal subspecific status
(A. m. nigrescens), its nomenclatural legacy prompted continued conservation
focus on this federally "endangered species" when, during the 1960s, the
Brevard County population declined severely due to deterioration of its
salt marsh habitat. In 1987, the last known dusky died in captivity after
last-ditch efforts to save the population through captive breeding failed.

Following the extinction of the dusky seaside sparrow, a molecular study of
nearly the entire seaside sparrow complex (which includes nine
conventionally recognized subspecies) produced a surprise (Avise and Nelson
1989; Nelson et al. 2000). In terms of mtDNA sequence, the dusky proved to
have been essentially indistinguishable from other Atlantic coast birds, but
highly distinct in genealogy from birds nesting along the Gulf of Mexico
coastline (Figure 9.7). This genetic split probably registers an ancient
(Pleistocene) population separation, as further evidenced by a striking
similarity between the phylogeographic pattern of the seaside sparrow and
those of several other (non-avian) estuarine taxa in the southeastern United
States (see Chapter 6). Thus, the traditional taxonomy for the seaside
sparrow complex (upon which endangered species designations and management
efforts were based) apparently had failed to capture the true phylogenetic
(at least matrilineal) partitions within this taxonomic group. It had given
special emphasis to a presumed biotic partition that was genealogically
shallow or nonexistent, and it had failed to recognize a deep
phylogeographic subdivision between most Atlantic and Gulf coast
populations.

BOX 9.5 Legal Protection for Species Under the ESA and CITES

In 1966, passage of the Endangered Species Preservation Act initiated federal
efforts in the United States to protect rare species from extinction. Three
years later, attempts to remedy this Act's perceived deficiencies (e.g., lack
of habitat protection) resulted in the Endangered Species Conservation Act of
1969. Again, shortcomings in the new act were recognized, and in 1973,
Congress enacted a more comprehensive Endangered Species Act (henceforth
ESA), which today remains the country's strongest legal strategy for
protection of rare species. The ESA's stated intent was to "provide a means
whereby the ecosystems upon which endangered species and threatened species
depend may be conserved, [and] to provide a program for the conservation of
such ... species."

To qualify for ESA protection, a taxon must appear on an official list that
is prepared and updated under the auspices of the U.S. Fish and Wildlife
Service and the National Marine Fisheries Service. Listings may be made for a
species, a subspecies, or a "distinct population segment," the latter
sometimes being interpreted as "a group of organisms that represents a
segment of biological diversity that shares an evolutionary lineage and
contains the potential for a unique evolutionary future" (NRC 1995). A
species is deemed to be endangered if it is at risk of extinction throughout
all or a significant portion of its range, and to be threatened if it is
likely to become endangered in the foreseeable future. Utilitarian criteria
for listing are that the plant or animal in question occur in numbers or
habitats sufficiently depleted to critically threaten its survival. For
example, grizzly bears in the lower 48 states have been listed as threatened,
whereas the much larger Alaska population has not. As of 2003, the lists of
threatened and endangered species included more than 1,250 taxa within U.S.
boundaries and another 500 or more taxa elsewhere. Interestingly, plant
species somewhat outnumber animal species in these lists.

The ESA has been criticized on several grounds (Geist 1992; Rohlf 1991),
including its focus on preserving particular species (rather than ecosystems
or biodiversity per se), its frequent emphasis on "charismatic megafauna"
(such as many birds, large mammals, and showy plants, perhaps to the neglect
of less conspicuous rare taxa), and the considerable time lag often involved
in the formal listing process. On the other hand, many of these concerns
reflect implementation problems (insufficient funding, lack of political
will, incomplete scientific knowledge, etc.) rather than faults in the
legislation itself (O'Connell 1992). In the future, the ESA's species-based
approach might well be supplemented with legislation designed explicitly to
fulfill the broader goals of ecosystem maintenance and biodiversity
protection. Nonetheless, at least as an interim measure, the ESA has been a
revolutionary, far-sighted, and relatively effective piece of governmental
legislation.

CITES (the Convention on International Trade in Endangered Species), also
adopted in 1973, has been signed by more than 125 countries. This treaty is
another enlightened legal document that focuses on species of conservation
concern. Its goal is to "regulate the complex wildlife trade by controlling
species-specific trade levels on the basis of biological criteria."
Prohibited under the CITES treaty are all commercial and most non-commercial
trade in several hundred species that are listed in its oft-updated
"Appendix I." Restricted but not entirely disallowed under CITES is
commercial trade in many additional threatened species listed in its
"Appendix II." "Appendix III" is a list of optional species that countries
might wish to protect because these taxa could soon become endangered by
trade. As of 2003, about 25,000 plant species and 5,000 animal species were
covered by CITES provisions, most of them appearing in Appendix II.

Like ESA, the CITES treaty was a visionary and helpful legal approach to
conservation. Its main weaknesses involve implementation difficulties in the
treaty's enforcement mechanisms.

Figure 9.7 Frequency histogram (mismatch distribution) of genetic distances
among seaside sparrows. Shown are estimates of mtDNA sequence divergence
between pairs of birds from various locales along the Atlantic and Gulf of Mexico
coastlines of the southeastern United States.
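
A mismatch distribution of the kind named in the Figure 9.7 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the haplotype labels and 10-site sequences below are invented, and the function names `count_differences` and `mismatch_distribution` are hypothetical helpers, not part of the original study. The sketch tallies raw pairwise difference counts, whereas the figure reports estimated sequence divergence.

```python
from collections import Counter
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def mismatch_distribution(sequences):
    """Map each pairwise difference count to the number of pairs showing it."""
    counts = Counter(
        count_differences(s1, s2)
        for s1, s2 in combinations(sequences.values(), 2)
    )
    return dict(sorted(counts.items()))


# Hypothetical haplotypes: labels and bases are invented for illustration only.
toy_haplotypes = {
    "atlantic_1": "ACGTACGTAC",
    "atlantic_2": "ACGTACGTAT",
    "gulf_1": "ACGAACCTGC",
    "gulf_2": "ACGAACCTGT",
}

distribution = mismatch_distribution(toy_haplotypes)
for n_diff, n_pairs in distribution.items():
    print(f"{n_diff} differences: {'#' * n_pairs} ({n_pairs} pairs)")
```

With these toy data the within-coast comparisons fall in one low mode and the between-coast comparisons in a second, higher mode, which is the bimodal signature that a deep phylogeographic split is expected to leave in such a histogram.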


These molecular findings do not excuse or justify any poor land man- 
agement practices (chronicled by Walters 1992) that may have led to the 
dusky seaside sparrow's extinction, nor should publication of the molecular 
results be interpreted as heartlessness over this population's loss. The 
extinction of any natural population is regrettable, particularly in this era of 
rapid deterioration of natural habitats and biodiversity. However, the field 
of conservation biology finds itself in a difficult situation, with hard choices 
to be made about which taxa can be saved and how best to allocate limited 
resources among competing conservation demands. Especially in such cir- 
cumstances, management decisions should be based on the best available 
scientific information. 

Another example of a misleading taxonomy for an endangered species 
involved the colonial pocket gopher, Geomys colonus, endemic to Camden 
County, Georgia. This taxon was first described in 1898 on the basis of 
rather cursory descriptions of pelage and cranial characteristics. The pop- 
ulation remained essentially unnoticed and unstudied until the 1960s, 
when gophers in Camden County were "rediscovered." The population
referable to G. colonus then consisted of fewer than 100 individuals, and it 
was listed as an endangered species by the State of Georgia. A molecular 
genetic survey was subsequently conducted of allozymes, chromosome 
karyotypes, and mtDNA RFLPs. None of these genetic assays detected a 
consistent genetic distinction between G. colonus and geographically adja- 
cent populations of its presumed sister congener, G. pinetis (Laerm et al. 
1982). Results were not attributable to a lack of sensitivity in the tech- 
niques employed because each molecular method revealed dramatic 
genetic differences among a broader geographic array of G. pinetis
populations (particularly those in eastern versus western portions of the
species' range; see Figure 6.8). The conclusion in this case was clear: Extant
gophers under the name of "G. colonus" did not warrant recognition as a 
distinct species. Either the description of G. colonus in 1898 was inappro- 
priate, or an originally valid (i.e., genetically distinct) species had gone 
extinct early in the twentieth century and was later replaced by G. pinetis 
immigrants into Camden County. 

Of course, in principle as well as in practice, no study can prove the 
null hypothesis that genetic differences between putative taxa are absent. 
One example of a "valid" species that proved to be nearly indistinguish- 
able in genetic composition from a close relative is the endangered pallid 
sturgeon (Scaphirhynchus albus). In assays of 37 monomorphic and poly- 
morphic allozyme loci, S. albus could not be genetically differentiated from 
its more common congener, the shovelnose sturgeon (S. platorynchus; 
Phelps and Allendorf 1983). However, pronounced morphological differ- 
ences between these taxa and their sympatric distribution implied that 
pallid and shovelnose sturgeons nonetheless qualify as good biological
species (the possibility of phenotypic plasticity was deemed unlikely;
Phelps and Allendorf 1983). Subsequent assays of microsatellite loci did 
eventually confirm that pallid and shovelnose sturgeons are genetically 
distinguishable in sympatry (Tranah et al. 2001), but the broader message 
is that molecular data (especially when preliminary or based on only one 
class of markers) should not be interpreted in isolation in making taxo- 
nomic or conservation judgments, but instead should be integrated with 
other available lines of evidence. 

Molecular reappraisals of taxonomically suspect species may, of 
course, also bolster the rationale for special conservation efforts. One case 
in point involves the nearly extinct silvery minnow (Hybognathus amarus), 
which is endemic to the Rio Grande in the southwestern United States. This 
species has had a troubled taxonomic history, with some researchers viewing
it as a distinct species and others placing it in synonymy with H. nuchalis
or H. placitus. However, based on a survey of 22 allozyme loci, Cook et al. 
(1992) observed several fixed allelic differences between these taxa, as well 
as overall levels of genetic distance (D > 0.10) somewhat greater than those 
normally distinguishing conspecific populations in other fish groups. The 
authors concluded that there was little justification for considering H. 
amarus conspecific with the other species with which it previously had been 
synonymized. 
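
To make a threshold such as D > 0.10 more concrete, the sketch below computes a genetic distance from allele frequencies. It assumes that D here denotes Nei's standard genetic distance, which is the usual convention in allozyme surveys but is not stated explicitly in this passage. The function name `nei_standard_distance` and the 10-locus toy data are hypothetical and are not taken from Cook et al. (1992).

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance, D = -ln(I).

    pop_x and pop_y are lists with one dict per locus, each mapping
    allele name -> allele frequency, with loci in the same order.
    """
    if len(pop_x) != len(pop_y):
        raise ValueError("both populations must be scored at the same loci")
    j_x = j_y = j_xy = 0.0
    for freqs_x, freqs_y in zip(pop_x, pop_y):
        alleles = set(freqs_x) | set(freqs_y)
        j_x += sum(freqs_x.get(a, 0.0) ** 2 for a in alleles)
        j_y += sum(freqs_y.get(a, 0.0) ** 2 for a in alleles)
        j_xy += sum(freqs_x.get(a, 0.0) * freqs_y.get(a, 0.0) for a in alleles)
    identity = j_xy / math.sqrt(j_x * j_y)  # Nei's genetic identity, I
    return -math.log(identity)


# Hypothetical data: 10 loci, 9 fixed for the same allele in both taxa,
# and 1 locus fixed for alternative alleles (a "fixed allelic difference").
shared = [{"a": 1.0}] * 9
taxon_1 = shared + [{"a": 1.0}]
taxon_2 = shared + [{"b": 1.0}]

d = nei_standard_distance(taxon_1, taxon_2)
print(f"D = {d:.3f}")
```

Under this assumption, a single fixed difference among ten otherwise identical loci gives I = 0.9 and D of roughly 0.105, which shows how even a few fixed allelic differences in a modest survey can push D past 0.10.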

Another case in point involves the highly endangered Kemp’s ridley 
turtle (Lepidochelys kempi), which nests almost exclusively at a single locale 
in the western Gulf of Mexico and has been the subject of the largest inter- 
national preservation effort for any marine turtle. The Kemp's ridley has a 
close relative, the olive ridley (L. olivacea), that is among the most abundant 
of marine turtles and has a nearly global distribution in warm seas. The
close morphological similarity between L. kempi and L. olivacea, and the
geographic distributions of these supposed sister taxa (which make little
sense under modern conditions of climate and geography; Carr 1967) raised
serious doubts about the evolutionary distinctiveness of L. kempi.
Nonetheless, reappraisals of this taxonomic assemblage based on mtDNA
assays indicated that L. kempi is more distinct from L. olivacea than assayed
Atlantic versus Pacific populations of L. olivacea are from one another
(Bowen et al. 1991, 1998). Furthermore, the levels of molecular
differentiation between the Kemp's and olive ridleys surpassed (slightly)
those observed among any conspecific populations of the globally distributed
green or loggerhead turtles (Figure 9.8). These findings leave little doubt
that L. kempi warrants special taxonomic recognition.

Figure 9.8 Phylogeny for ridley and loggerhead marine turtles estimated from
mtDNA data. Assayed olive ridleys from the Atlantic and Pacific oceans proved
indistinguishable and, hence, are plotted here as a single unit. Also shown
are the maximum levels of mtDNA genetic distance observed among conspecific
loggerhead turtles and conspecific green turtles from around the world.
(After Bowen et al. 1991.)

Neglected taxonomies for endangered forms can kill, as exemplified by 
studies of the tuataras of New Zealand. These impressive reptiles were 
treated by government and management authorities as belonging to a sin- 
gle species, despite molecular and morphological evidence for three dis- 
tinct (and taxonomically described) groups (Daugherty et al. 1990). Official 
neglect of this genetic diversity may unwittingly have consigned one form 
of tuatara (Sphenodon punctatus reischeki) to extinction, whereas another dis- 
tinctive form (S. guentheri) appears to have survived to the present only by 
good fortune. As stated by Daugherty et al. (1990), good taxonomies "are 
not irrelevant abstractions, but the essential foundations of conservation 
practice." 

In recent years, molecular genetic appraisals of endangered populations 
and taxa "at the species boundary" have become almost routine (Goldstein 
et al. 2000). Table 9.4 summarizes several additional examples in which
molecular markers have been informative with regard to characterizing
genetic relationships in taxonomically problematic groups of special conser- 
vation concern. 


MOLECULAR FORENSICS AND LAW ENFORCEMENT. A common challenge in 
the enforcement of wildlife protection laws is to identify the biological 
source of blood, carcasses, meat, feathers, or commercial products from 
endangered or illegally harvested species. Various molecular markers have 
tremendous utility for such forensic purposes due to their species-diagnos- 
tic power. For example, Cronin et al. (1991c) compiled a list of diagnostic 
molecular characters (mtDNA digestion profiles in this case) for 22 species 
of large mammals that are frequent objects of illegal poaching. Similarly, 
Ross et al. (2003) compiled a list of mtDNA sequences for numerous whale 
and dolphin species in a Web-based "DNA surveillance" program, which
can be used as a standard reference against which to compare mtDNA 
sequences obtained from unknown cetacean material. Especially when used 
in conjunction with appropriate statistical assignment tests (e.g., Manel et al. 
2002), such molecular markers enable researchers and law enforcement 
agencies to characterize unknown biological material when obvious mor- 
phological characters are unavailable for analysis. 
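
The reference-comparison idea described above can be sketched as a nearest-match search. This is a minimal illustration, assuming pre-aligned sequences of equal length and a simple proportion-of-differences distance. The reference labels and sequences are invented placeholders, `p_distance` and `closest_reference` are hypothetical helper names, and the sketch is not a substitute for a statistical assignment test of the kind cited above.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def closest_reference(query, references):
    """Return (label, distance) for the reference sequence nearest the query."""
    return min(
        ((label, p_distance(query, seq)) for label, seq in references.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical reference set; labels and sequences are placeholders only.
reference_sequences = {
    "species_A": "ACGTACGTACGT",
    "species_B": "ACGAACCTGCGT",
    "species_C": "TTGTACGAACGA",
}

unknown_sample = "ACGAACCTGCGA"
best_label, best_distance = closest_reference(unknown_sample, reference_sequences)
print(f"closest match: {best_label} (p-distance {best_distance:.3f})")
```

In practice a match would be accepted only if the distance to the nearest reference is small relative to known within-species variation, and only if the reference set covers all plausible source species.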
An early application of molecular markers to law enforcement occurred 
in 1978, when a Japanese trawler fishing in United States coastal waters was 
suspected of illegally harvesting a rockfish species, Sebastes alutus. Tissues 
confiscated by enforcement officers did indeed prove upon protein elec- 
trophoretic examination to have come from that species, thereby contradict- 
ing claims in the trawler’s log that no such specimens had been taken (Utter 
1991). A similar case in Texas involved a suspected illegal sale of flathead 
catfish (Pylodictis olivaris). Protein electrophoretic analyses verified the 
species identity of frozen fish fillets and led to a fine levied against the sell- 
er (Harvey 1990). Molecular findings in wildlife forensics can also exonerate 
the falsely accused. In another case in Texas, electrophoretic analyses of con- 
fiscated fillets revealed that fishermen were innocent of suspected illegal 
possession of red drum (Sciaenops ocellatus) and spotted sea trout (Cynoscion 
nebulosus) (Harvey 1990). 
Some forensic applications involve distinguishing conspecific popula- 
tions. For example, a commercial catch of king crab (Paralithodes camtschati- 
ca) was claimed to have been harvested in a region of Alaska open to fish- 
ing, but it proved upon protein electrophoretic examination to have come 
from a closed area in the northwestern Bering Sea (Seeb et al. 1990). A more 
peculiar example of molecular forensics in a geographic context involved a 
bass-fishing tournament in Texas, in which a winning fisherman was sus- 
pected of having smuggled in a huge largemouth bass (Micropterus 
salmoides). Tissue samples from the trophy specimen were examined elec- 
trophoretically and shown to have come from a genetically distinct Florida 
subspecies (Philipp et al. 1981) that apparently had indeed been imported 
illegally (Harvey 1990). 
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TABLE 9.4. Examples ilfüstratíng the wide variety of taxonomically challenging groups of 
conservation concern for which diverse molecular approaches have been employed 
in genealogical appraisals 





Threatened taxonomic 





group Description 
Plants 
Acacia trees cpDNA sequences were used to assess phylogenetic relationships 
of rare and common species in Western Australia. 
Epipactis orchids Allozyme and cpDNA markers revealed population structures, 


Eucalyptus trees 
Zelkova trees 


Fungi 
Lentinula edodes 


Invertebrate animals 
Achatinella land snails 


Amblema freshwater 
mussels 


Cambarus cave crayfish 


Cicindela tiger beetles 


Crassostrea oysters 


Hemileuca buckmoths 


Lithasia aquatic snails 


Lucetta sponges 


Vertebrate animals 
Acipenseriforme fishes 


Charadrius plovers 


Dipsochelys tortoises 


breeding systems, and genetic relationships of three species. 
Allozymes helped reveal relationships among three taxonomically 
difficult species endemic to Tasmania. 
cpDNA sequences revealed strong genetic differentiation between 
two relict tree congeners in Sicily. 


rRNA sequences revealed four highly distinct genetic lineages 
that had been masquerading asa single morphotypic species. 


mtDNA sequences revealed several distinct evolutionary units of 
the endangered Hawaiian tree snail (A. mustelina). 

Allozymes and mtDNA were used to search for diagnostically 
distinct groups in these and related endangered forms for which 
morphological features are sparse. 

Allozyme data were used to estimate genetic variation and 
divergence in a difficult taxonomic complex of rare species in 
the Ozarks. 

Phylogenetic lineages and conservation units in these beach- 
dwelling insects were evaluated by mtDNA (including that 
isolated from nineteenth-century dried specimens). 

mtDNA and nDNA markers clarified phylogeography in a rare 
Portuguese oyster and its relationships to a cryptic congener. 

Small populations suspected to represent a distinct species proved 
indistinguishable from other populations in mtDNA and 
allozyme markers. 

mtDNA sequence data were used to reappraise phylogeny and 
taxonomy in about a dozen recognized taxa, many of which 
are imperiled. 

rDNA sequences revealed four primary, regionally restricted clades 
that help characterize World Heritage Areas in the western Pacific. 


Molecular sequence data were used to estimate phylogeny in 
several endangered species of sturgeon and paddlefish. 

Geographically disjunct breeding populations of endangered 
piping plover were shown to have been in extensive and recent 
genetic contact, as gauged by allozyme comparisons. 

Individuals suspected by morphology to be Seychelles tortoises, 
otherwise extinct, proved upon molecular inspection not to be 
genetically divergent from common extant tortoises from Aldabra. 


i 
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Reference 





Byrne et al. 2001 
Squirrel et al. 2002 
Turner et al. 2000 


Fineschi et al. 2002 


Hibbett and Donoghue 1996 


Holland and Hadfield 2002 


Mulvey et al. 1997, 1998 


Koppleman and Figg, 1995 


Goldstein and DeSaile 2003; Vogler 1994 


Huvet et al. 2000 


Legge et al. 1996 


Minton and Lydeard 2003 


Worheide et al. 2002 


Birstein et al. 1997; Krieger et al. 2000 


Haig and Oring 1988 


Palkovacs et al. 2003; see also Austin and Arnold 2002; Austin et al. 2003 


Dusicyon foxes A taxonomically problematic species (D. fulvipes) in Peru was i Yahnke et al. 1996 
shown to be genetically quite distinct from nearest congeners. (continued)

TABLE 9.4 (continued)  Examples illustrating the wide variety of
taxonomically challenging groups of conservation concern for which diverse
molecular approaches have been employed in genealogical appraisals

Taxonomic group / Description / Reference

Vertebrate animals

Equus (horses and zebras)
    mtDNA sequences were used to clarify phylogenetic relationships among
    six species and eight subspecies of mostly rare equids.
    Reference: Oakenfull et al. 2000

Eubalaena (right whales)
    mtDNA sequences revealed three distinct maternal evolutionary units in
    a taxonomic complex usually considered to contain two species.
    Reference: Rosenbaum et al. 2000

Gazella (gazelles)
    mtDNA sequence analyses called into question whether captive
    individuals of the otherwise extinct Saudi gazelle truly belong to
    this taxon.
    Reference: Hammond et al. 2001

Icterus (orioles)
    mtDNA sequences were used to assess the phylogeographic status of
    several endangered oriole taxa in the Lesser Antilles.
    Reference: Lovette et al. 1999

Lynx (cats)
    A critically endangered and probably extinct taxonomic species
    (L. pardinus) from Iberia proved to be highly distinct from other
    Lynx species.
    Reference: Beltrán et al. 1996

Melanotaenia (rainbowfish)
    A taxonomically problematic species (M. eachamensis) was shown to be
    genetically quite distinct from its nearest congeners.
    Reference: Zhu et al. 1998

Panthera (tigers)
    Nuclear and mitochondrial markers were used to sort out genetic
    relationships among five recognized geographic subspecies.
    Reference: Cracraft et al. 1998

Petrogale (rock wallabies)
    Multifaceted genetic analyses of populations in this taxonomic complex
    identified a differentiated remnant form in Victoria.
    Reference: Browning et al. 2001

Polioptila (gnatcatchers)
    mtDNA analyses questioned some of the taxonomic designations upon
    which conservation plans for these birds were based.
    Reference: Zink et al. 2000

Scaphirhynchus (sturgeon)
    Genetic relationships involving three taxonomically problematic
    species were examined using mtDNA genotyping.
    Reference: Campton et al. 2000

Sternotherus (freshwater turtles)
    A taxonomically problematic species (S. depressus) was shown to be
    genetically quite distinct from its nearest congeners.
    Reference: Walker et al. 1998c

Tympanocryptis ("dragons")
    mtDNA data were used to assess phylogenetic and taxonomic issues in
    species and subspecies of Australian agamid reptiles.
    Reference: Scott and Keogh 2000


A global moratorium on commercial harvest of many cetacean species was
established by the International Whaling Commission in 1985-1986, but
whaling never completely stopped, and whale meat (known as kujira in
Japan, gorae in Korea) continues to be sold in retail outlets, especially
in eastern Asia and Scandinavia. One possibility is that this "whale meat"
comes from small cetaceans, such as porpoises and dolphins, that still can
be harvested legally. Or perhaps the retail material is not from cetaceans,
but instead is compressed fishmeal or another seafood substance that has
been mislabeled to appear more exotic or pleasing to consumers. In the
early 1990s, Baker and Palumbi (1994, 1996; Baker et al. 1996) began
purchasing "whale products" from Asian outlets for mtDNA analysis. By
comparing unknown samples against a reference database of cetacean mtDNA
sequences, they were able to identify the species and sometimes even the
geographic source of each sample (Figure 9.9). About half of the retail
products proved to have come from whale or dolphin species that could
plausibly have been harvested under legal permits, but some of the other
retail samples came from endangered humpback and fin whales that had been
killed illegally.

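
The comparison of an unknown sample against a reference database, as
described above, can be sketched in a few lines of Python. This is only a
minimal illustration under stated assumptions, not the procedure used by
Baker and Palumbi, whose assignments rested on phylogenetic placement
(Figure 9.9). Here a sample is simply assigned to the reference sequence
with the smallest proportion of differing sites. The toy sequences and the
names REFERENCE, p_distance, and identify are hypothetical.

```python
# Hypothetical toy data: short, pre-aligned fragments standing in for
# mtDNA control-region sequences. Real reference sequences are far longer.
REFERENCE = {
    "minke whale": "ACGTTGCAAT",
    "humpback whale": "ACGATGCTAT",
    "fin whale": "TCGATGGTAT",
    "harbor porpoise": "TTGACCGTAA",
}


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(sample, reference=REFERENCE):
    """Return (closest reference name, distance) for an unknown sample."""
    scored = ((name, p_distance(sample, seq)) for name, seq in reference.items())
    return min(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(identify("ACGATGCTAA"))
```

A nearest-match rule of this kind gives no measure of confidence; bootstrap
support on a tree, as in Figure 9.9, is one way to express how firmly a
sample belongs to a lineage.
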
Analogous research programs for the molecular forensic identification of
wildlife products have since been instituted for several other taxonomic
groups that include threatened or endangered species (Table 9.5). In 1989,
the U.S. Fish and Wildlife Service opened a wildlife forensics laboratory
in Ashland, Oregon. The purpose of this first-of-its-kind "Scotland Yard
for animals" is to identify wildlife products such as those confiscated
from poachers or illegal traders. This laboratory practices traditional
morphological forensics (based on "bones and feathers"), but much of its
effort involves identification of unknown samples by protein and DNA
evidence. The magnitude of the task is daunting. Unlike the several hundred
police crime labs in the United States, which deal with a single species
(Homo sapiens), the Ashland workers (plus a small cadre of like-minded
university researchers) must cope with molecular diagnostics in the entire
remainder of the biological world.

Figure 9.9  Forensic identification of retail "whale meat" products. Shown
is a molecular phylogeny (based on mtDNA control region sequences) for
representative cetacean species, with black circles indicating > 90%
bootstrap support for an indicated clade. Note how this genetic analysis
permitted assignment of each retail sample (numbered) to one or another
cetacean species or lineage. (After Baker and Palumbi 1994; Frankham et al.
2002.)


Hybridization and introgression


In the context of rare and endangered species, instances of introgressive 
hybridization, as sometimes documented using molecular markers, have 
raised both biological and legal issues. 


BIOLOGICAL ISSUES. One evolutionary threat to rare species is genetic 
swamping through extensive hybridization with related taxa (Levin et al. 
1996; Rhymer and Simberloff 1996). An empirical example involves the 
nearly extinct mahogany tree Cercocarpus traskiae, endemic to Santa 
Catalina Island in Los Angeles County, California. Protein electrophoretic, 
RAPD, and morphological appraisals of about a dozen remaining adult 
trees revealed that about one-half of the individuals were products of 
hybridization between C. traskiae and other Cercocarpus species that were 
more abundant on the island (Rieseberg and Gerber 1995; Rieseberg et al. 
1989). These genetic discoveries led to two management suggestions 
intended to lower the probability of further hybridization (Rieseberg et al. 
1989): Eliminate individuals of other Cercocarpus species from near the
remaining pure C. traskiae specimens, and transplant seedlings and
established cuttings from non-hybrid individuals to more remote areas on Santa
Catalina Island. 

In another endangered plant—the yellow larkspur (Delphinium luteum), 
whose range is restricted to Bodega Bay, California—questions arose as to 
whether this localized species might itself be a product of interspecific 
hybridization. To test the hypothesis that D. luteum arose from crosses 
between two common congeners (D. decorum and D. nudicaule), Koontz et al. 
(2001) used allozyme and RAPD techniques. Diagnostic markers for the can- 
didate parental taxa proved not to be additive in D. luteum, indicating that 
this endangered species was probably not of recent hybrid origin. 
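
The additivity test mentioned above can be illustrated with a small Python
sketch. It assumes dominant, presence/absence markers (such as RAPD bands)
and asks what fraction of each candidate parent's diagnostic markers appears
in the putative hybrid; a recent hybrid would be expected to carry most
diagnostic markers of both parents. The marker labels and the function name
additivity are hypothetical, and this is not the analysis of Koontz et al.
(2001).

```python
def additivity(parent_a, parent_b, candidate):
    """Fraction of each parent's diagnostic markers found in the candidate.

    Each argument is a set of marker labels scored as present. A marker is
    treated as diagnostic for a parent if it occurs in that parent only.
    """
    diag_a = parent_a - parent_b
    diag_b = parent_b - parent_a
    if not diag_a or not diag_b:
        raise ValueError("each parent needs at least one diagnostic marker")
    frac_a = len(diag_a & candidate) / len(diag_a)
    frac_b = len(diag_b & candidate) / len(diag_b)
    return frac_a, frac_b


if __name__ == "__main__":
    # Hypothetical marker profiles, for illustration only.
    parent_a = {"m1", "m2", "m3", "m4"}
    parent_b = {"m3", "m4", "m5", "m6"}
    print(additivity(parent_a, parent_b, {"m1", "m2", "m3", "m5", "m6"}))
    print(additivity(parent_a, parent_b, {"m3", "m4", "m7"}))
```

Fractions near 1 for both parents are what a recent hybrid origin predicts;
low fractions, as in the second call, argue against it.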

A probable example of genetic swamping in an animal species involves 
the declining New Zealand gray duck (Anas superciliosa), which has been 
severely affected by hybridization with introduced mallard ducks (A. 
platyrhynchos) (Rhymer et al. 1994). Near the other end of the continuum, 
hybridization may be confined to production of first-generation hybrids. An 
example involves the bull trout (Salvelinus confluentus), a federally threat- 
ened species that occasionally hybridizes with introduced brook trout (S. 
fontinalis) in the western United States. Analyses using allozyme and 
mtDNA markers revealed that hybrids beyond the F1 generation were
present, but quite rare (Kanda et al. 2000b; Leary et al. 1993). In such cases, any
detrimental effects of hybridization may mostly entail wasted reproductive 
effort, or perhaps negative social or ecological effects of hybrid animals, 
rather than genetic introgression per se. 

TABLE 9.5  Examples of forensic applications of PCR-based DNA assays
employed to identify the biological source of the material indicated

Wildlife product of concern / Taxonomic group / Description / Reference

Caviar (fish eggs); sturgeon (Acipenser, Huso)
    Food samples in retail outlets examined for species of origin; about
    25% were mislabeled as to species.
    Reference: Birstein et al. 1998; DeSalle and Birstein 1996

Body parts; sharks
    Identifications made possible for species in worldwide pelagic
    fisheries.
    Reference: Shivji et al. 2002

Penises; pinnipeds (seals and allies)
    These supposed aphrodisiac samples tested for species of origin; many
    samples proved not to be from pinnipeds.
    Reference: Malik et al. 1997

"Turtle meat"; large freshwater turtles (especially Macroclemys)
    Food samples tested for species of origin; about 25% actually were
    from alligator.
    Reference: Roman and Bowen 2000

Scrimshaw; sperm whales
    Molecular methods developed to extract DNA samples from teeth and
    bones.
    Reference: Pichler et al. 2001

Feces and hair; Chinese tiger (Panthera tigris amoyensis)
    Unknown fecal samples genotyped and compared against known references
    to confirm presence of tigers in China.
    Reference: Wan et al. 2003

Carcasses; deer
    DNA fingerprinting used to identify remains of protected cervid
    species.
    Reference: Fang and Wan 2002

Infants; chimpanzee (Pan troglodytes)
    The geographic origin of orphaned or "refugee" animals was determined.
    Reference: Goldberg 1997


Cutthroat trout (Oncorhynchus clarki) native to the western United States
and Canada comprise an assemblage of approximately 14 recognized
subspecies, many of which are threatened by human habitat alterations and
artificial introductions of non-native trout species (Behnke 2002). Most of
the subspecies have protected legal status, and two already may be extinct.
Surveys of allozymes and mtDNA in scores of cutthroat populations have
revealed a complex phylogeographic pattern, with some subspecies almost
indistinguishable genetically and others as distinct as normal congeneric
species (see Allendorf and Leary 1988, and references therein). Molecular
markers have also documented extensive hybridization between transplanted
cutthroat trout subspecies as well as between native cutthroat and
introduced rainbow trout (O. mykiss) (Weigel et al. 2002, 2003). For
example, introgression from rainbow trout was observed in 7 of 39 assayed
populations in Utah (Martin et al. 1985); and in Montana, more than 30 of
80 populations formerly thought to be pure "westslope" cutthroat trout
proved upon molecular examination to include products of hybridization
with either rainbow trout or the "Yellowstone" cutthroat form. In the
Flathead River drainage in Montana (considered one of the last remaining
strongholds of native westslope cutthroat trout), only 2 of 19 headwater
lakes sampled contained pure populations, and detailed genetic analyses
further revealed that the hybridized headwater populations were "leaking"
foreign genes into downstream areas (Allendorf and Leary 1988).

Such findings on the introgressed structure of particular cutthroat
populations led to two conservation-related concerns (Allendorf and Leary
1988). First, hybrids between genetically differentiated trout often
exhibit reduced fitness due to developmental abnormalities (Allendorf and
Waples 1996). Second, extensive introgressive hybridization carries the
danger of genetic swamping and loss of locally adapted populations. As
stated by Allendorf and Leary (1988), "The eventual outcome of widespread
introgression and continued introduction of hatchery rainbow trout is the
homogenization of western North American trout into a single taxon. Thus,
we would exchange all of the diversity within and between many separate
lineages, produced by millions of years of evolution ... for a single new
mongrel species."

In the Canidae (dogs and allies), several studies have used molecular 
markers to address whether endangered wild species hybridize occasional- 
ly among themselves, or perhaps with domestic canines. In the southeastern 
United States, mtDNA genotypes normally characteristic of domestic dogs 
(Canis familiaris) have been reported in some coyotes (C. latrans) (Adams et 
al. 2003). In northeastern North America, a unidirectional introgression of 
coyote mtDNA into some gray wolf (C. lupus) populations may have 
occurred following hybridization between gray wolf males and coyote 
females (Figure 9.10). In eastern Europe, some gray wolves display mtDNA 
genotypes characteristic of domestic dogs, again probably as a result of 
hybrid matings involving wolf males (Randi et al. 2000; but see also Vila and 
Wayne 1999). In Africa, molecular markers indicate that one population of 
the world’s most endangered canid—the Ethiopian wolf (C. simensis)—like- 
wise contains genes derived from hybridization with domestic dogs 
(Gottelli et al. 1994).

Figure 9.10  Phylogeny for mtDNA genotypes observed in gray wolves and
coyotes from North America. Note that whereas several genotypes observed in
wolves (W) group separately from those of coyotes (C), others (indicated by
asterisks) do not, the exceptions perhaps being attributable to recent
introgressive hybridization between these species. Additional data from
mtDNA (Wayne et al. 1992) and microsatellites (Roy et al. 1994) were
interpreted as further support for this possibility (Wayne 1996). (After
Lehman et al. 1991.)


One endangered canid has even been suggested to be a product of past
hybridization and introgression. The red wolf (C. rufus) formerly ranged
throughout the southeastern United States, but declined precipitously after
1900 and became extinct in the wild in the mid-1970s. Molecular analyses of
remaining captive animals (as well as museum-preserved skins and blood
samples from deceased wild specimens) revealed that extant red wolves are
genetically similar to gray wolves, but also contain some alleles possibly
derived from hybridization with coyotes (Roy et al. 1996; Wayne 1992;
Wayne and Jenks 1991). Whether this hypothesized introgression occurred
in ancient times or recently, following an eastward range expansion of
coyotes, is unclear (Nowak and Federoff 1998). Indeed, altogether different
interpretations of these data have been advanced (Dowling et al. 1992;
Ferrell et al. 1980; Nowak 1992). For example, P. J. Wilson et al. (2000)
reported that red wolves are genetically most similar to small-bodied
eastern Canadian wolves, with which they might share common ancestry
independent of any hybridization with other canids. These uncertainties
have produced intense debates over the taxonomy of C. rufus (Phillips and
Henry 1992) as well as the advisability of management programs that include
restoration of a viable wild population in North Carolina (Gittleman and
Pimm 1991; Nowak and Federoff 1998).

Based on molecular and other evidence, past or current hybridization 
between fish species has contributed to the genetic composition of several 
recognized taxa, including some that are considered threatened or endan- 
gered (de Marais et al. 1992; Gerber et al. 2001; Meagher and Dowling 1991; 
see review in Dowling and Secor 1997). The same can be said of a number 
of plant species (see review in Rieseberg 1997). Thus, rather than being 
merely an erosive force that diminishes or swamps preexisting genetic 
diversity by blurring species’ distinctions, introgressive hybridization can 
sometimes also be viewed as a highly creative evolutionary force (Arnold 
1997). Under this view, hybridization can spawn new genotypic diversity, 
move genetic adaptations between species, and sometimes even generate 
new species that represent stabilized recombinant lineages. 

Another biological context in which hybridization might be a good thing
for conservation efforts is exemplified by attempts to save the endangered
Florida panther (Felis concolor coryi), a subspecies of cougar (also known
as puma or mountain lion) from the Florida Everglades. Despite protection
from hunting since the late 1960s, the population has continued to decline,
due in part to overt genetic defects (including increased frequencies of
undescended testicles and defective sperm in males) attendant on intense
inbreeding in this small population of only 60-70 animals (O'Brien et al.
1990, 1996). In a theoretical analysis addressing population genetic
aspects of the Florida panther's imperiled condition, Hedrick (1995)
concluded that a genetic restoration of the population could be achieved by
translocating Texas cougars in such a way as to promote about 20% gene flow
into the Florida population in an initial generation and about 2%–4% in
generations thereafter. In 1995, an introduction program was begun with the
release into southwestern Florida of eight Texas cougars, which have since
been hybridizing successfully with the native animals (Land and Lacey 2000;
Maehr et al. 2002).
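
To give a feel for what such a gene flow schedule implies, the following
Python sketch tracks the expected fraction of the gene pool that traces to
introduced animals under a simple one-way migration recursion, q' = q(1 - m)
+ m. It is a deliberately minimal illustration, not Hedrick's (1995) model:
it ignores drift, selection, and overlapping generations, and the later
rate of 3% is an assumed midpoint of the 2%–4% range quoted above. The
function name introduced_ancestry is hypothetical.

```python
def introduced_ancestry(first_gen_m=0.20, later_m=0.03, generations=5):
    """Expected introduced-ancestry fraction after each generation.

    first_gen_m: gene flow in the initial generation (about 20% in the text).
    later_m: gene flow per generation thereafter (assumed 3%, the midpoint
             of the 2%-4% range in the text).
    """
    q = 0.0  # the recipient population starts with no introduced ancestry
    history = []
    for generation in range(generations):
        m = first_gen_m if generation == 0 else later_m
        q = q * (1.0 - m) + m  # a fraction m of the gene pool is replaced
        history.append(q)
    return history


if __name__ == "__main__":
    for generation, q in enumerate(introduced_ancestry(), start=1):
        print(f"generation {generation}: {q:.3f}")
```

Under these assumptions the introduced fraction rises quickly in the first
generation and only slowly afterward, which is the intent of a schedule
meant to relieve inbreeding without replacing the native gene pool.
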

In summarizing the consequences of hybridization and introgression for
conservation biology, Allendorf et al. (2001) concluded that these
phenomena have contributed to the extinction of many species (directly or
indirectly), but have also played important creative roles in the evolution
of many plant and animal taxa. They also concluded that "any policy that
deals with hybrids must be flexible and must recognize that nearly every
situation involving hybridization is different enough that general rules
are not likely to be effective."


LEGAL ISSUES. For several years, the "Hybrid Policy" of the Endangered 
Species Act (Box 9.6) was a flashpoint for legal controversies surrounding 
the biological topic of hybridization and introgression in conservation pro- 
grams. This policy, which initially denied formal protection to organisms of 
hybrid ancestry, served as a basis for challenging several existing endan- 
gered species designations and associated management programs. The sit- 
uation of the red wolf, described above, provides one example. Another 
involves the gray wolf, which, as mentioned, also appears to have 
hybridized on occasion with the coyote. When interpreted against the 
philosophical platform of the original Hybrid Policy, these genetic findings 
prompted at least one petition to the Interior Department to remove the 
gray wolf from the list of endangered species in the northern United States. 
[This petition was then denied by the U.S. Fish and Wildlife Service (Fergus 
1991).] Clearly, for several reasons, including those mentioned in Box 9.6, 
the mere documentation of hybridization involving an endangered popu- 
lation should not be sufficient grounds for removing an endangered or 
threatened species from the protection rosters.

Species phylogenies and macroevolution


PHYLOGENETIC APPRAISALS. Phylogenetic considerations at the species level
and above can also be of relevance to conservation biology. Such interest
may be partly academic. For example, the ancestry and geographic origin of
the endangered Hawaiian goose (Nesochen sandvicensis) have long been
intriguing. Based on mtDNA analyses of living and fossil material, it now
appears that the morphologically distinctive Hawaiian goose is allied more
closely to the Canada goose (Branta canadensis) than to the black brant
(Branta bernicla) or emperor goose (Chen canagica), two other candidate
species (Paxinos et al. 2002a,b; Quinn et al. 1991). From this molecular
evidence on maternal lineages, it was concluded that the Hawaiian goose's
ancestors colonized the islands from North America within at most the last
one million years.

North America's black-footed ferret (Mustela nigripes) is a highly
endangered member of the Mustelidae (weasel and skunk family). An abundant
species with a broad distribution a century ago, it was decimated primarily
through human eradication of its principal prey base and associate: prairie
dogs (Cynomys spp.). In 1981, a few remaining specimens of M. nigripes were
discovered in Wyoming, and molecular analyses (allozymes and
microsatellites) showed that this deme had extremely low genetic
variability because of severe population bottlenecks (see Table 9.1). The
black-footed ferret has more common relatives elsewhere, however, including
the steppe polecat (M. eversmanni) of Siberia. O'Brien et al. (1989)
employed allozyme assays to assess the phylogenetic position of M. nigripes
within Mustelidae (Figure 9.11). These data confirmed that M. eversmanni
and M. nigripes are sister taxa, and also showed that they differ
genetically (D = 0.08) by about as much as do closely related congeners in
many other mammalian groups (see Figure 1.2). These two species probably
separated about 0.5–2.0 mya (O'Brien et al. 1989).

BOX 9.6  Hybrid Policies under the Endangered Species Act

In a series of official opinions issued by the Solicitor's Office of the
U.S. Department of the Interior beginning in 1977, it was concluded that
natural or artificial hybrids between endangered species, subspecies, or
populations should not receive protection under the Endangered Species Act
(ESA; see Box 9.5). The rationale was that even if these hybrids were
themselves to breed, they would not produce purebred offspring of either
parental taxon and, hence, would not promote the purposes of the ESA.

These decisions (which became known as the "Hybrid Policy") had serious
ramifications. For example, they prompted formal petitions from some land
use constituencies to remove certain protected taxa from the endangered
species list on the grounds that introgression had compromised the genetic
integrity of the listed forms. However, as noted by Grant and Grant (1992),
many, if not most, species of plants and animals hybridize at least
occasionally in nature, so if hybridizing species fall outside the limits
of protection afforded by the ESA, few candidate taxa will ever qualify for
protection; and "if rarity increases the chances of interbreeding with a
related species, presumably because conspecific mates are scarce, then the
species most in need of protection, by virtue of their rarity, are the ones
most likely to lose it under current practice, by hybridizing."

O'Brien and Mayr (1991) attacked the Hybrid Policy on additional grounds by
claiming that definitional and other operational difficulties had produced
"confusion, conflict, and ... certain misinterpretations of the [Endangered
Species] Act by well-intentioned government officials." They also concluded
that whereas management programs that promote hybridization between
distinct species should normally be discouraged, the Hybrid Policy should
not be applied to native subspecies or populations because the latter
retain a potential to interbreed as part of ongoing ecological and
evolutionary processes in nature.

Such criticisms from biologists prompted a withdrawal memorandum from the
Solicitor's Office, stating that "the rigid standards set out in those
previous opinions [of the Hybrid Policy] should be revisited" and that "the
issue of 'hybrids' is more properly a biological issue than a legal one."
Consequently, the U.S. Fish and Wildlife Service now recognizes that
limited amounts of introgression do not automatically disqualify
individuals from "species membership," nor do they necessarily preclude
partially introgressed populations from being afforded legal protection
under the ESA.

Figure 9.11  Phylogenetic position of the endangered black-footed ferret
(Mustela nigripes) within the Mustelidae, based on allozyme comparisons. Other 
species assayed were the Siberian steppe polecat (M. eversmanni), European polecat 
(M. putorius), mink (M. vison), striped skunk (Mephitis mephitis), spotted skunk 
(Spilogale putorius), African striped skunk (Ictonyx striatus), and, as an outgroup, the 
American black bear (Ursus americanus, Ursidae). (After O’Brien et al. 1989.) 


All seven to eight species of marine turtles are considered either
threatened or endangered. Their taxonomy and systematics have been
controversial due to phylogenetic uncertainties at levels ranging from
population distinctions to deeper evolutionary alliances among species, and
some of these uncertainties have had conservation consequences. Examples
involving the green and ridley turtles already have been described. Other
challenging questions include the following: Is the eastern Pacific black
turtle (Chelonia agassizi) specifically distinct from the green turtle (C.
mydas)? Is the spongivorous hawksbill (Eretmochelys imbricata) allied
phylogenetically to herbivorous green turtles or to carnivorous loggerheads
(Caretta caretta)? Is the Australian flatback (Natator depressa) allied
closely to the greens (as its earlier placement within Chelonia suggested),
or is it a relative of the loggerheads? Molecular markers have
provisionally answered such questions (see review in Bowen and Avise 1996).
For example, a phylogeny estimated from mtDNA sequences (Figure 9.12)
suggested that black turtles fall well within the range of genetic
differentiation exhibited among green turtle rookeries worldwide; that the
hawksbill is more closely related to the loggerhead complex than to green
turtles and hence probably evolved from a carnivorous rather than a
herbivorous ancestor; and that the flatback turtle is highly distinct from
both the loggerhead and green complexes and is roughly equidistant from
both.

Figure 9.12  Phylogeny for all recognized species of marine turtles (plus
the freshwater snapping turtle C. serpentina as an outgroup) estimated
using sequence data from the mtDNA cytochrome b gene. For species
represented by more than one sample, the individuals came from different
oceanic basins. Exact orders of the nodes within each shaded box are
uncertain, differing slightly with alternative methods of data analysis.
(After Bowen et al. 1993a.)

Such phylogenetic appraisals of endangered (or other) assemblages are 
of relevance to conservation biology in the general sense that they provide 
an understanding of evolutionary relationships among species and higher 
taxa. Should their ramifications be extended by using phylogenetic position 
as a criterion for prioritizing taxa with regard to conservation value? 


PHYLOGENETICS AND CONSERVATION PRIORITIES. If all extant species were 
non-threatened, or if resources available for conservation were unlimited, 
there would be little need to rank taxa for conservation value. However, all 
ongoing or contemplated conservation efforts involve establishing
priorities. In the broadest sense, any of three underlying rationales justify human
efforts to protect biodiversity: aesthetic considerations, the realization that 
living species provide utilitarian services (Balmford et al. 2002), and an eth- 
ical stance that attributes an intrinsic value to life (Crozier 1997; Nixon and 
Wheeler 1992). In actual management practices, however, several more 
proximate ranking criteria routinely come to the fore (explicitly or implic- 
itly): rarity, restricted distribution, perceived ecological importance, 
“charisma” [for example, Clark and May (2002) confirmed that large, 
attractive, or emotive species attract more attention than small or drab 
species, thus producing a bias toward "charismatic megabiota" in conser- 
vation programs], economics, management feasibility, and phylogenetic 
distinctiveness. 

This last criterion, which can be informed by molecular genetic findings, 
warrants elaboration. Implicit in the writings of many biologists is the 
notion that evolutionarily distinct taxa contribute disproportionately to 
overall biodiversity and thus should be prioritized for conservation efforts. 
For example, the tuataras (Sphenodon) of New Zealand might be deemed to 
be of exceptional conservation value because they are the sole living mem- 
bers of a very ancient reptilian family (Daugherty et al. 1990). On the other 
hand, Erwin (1991) suggested the opposite: that such "living fossils" are 
likely to be evolutionary dead ends, and that members of rapidly speciating 
clades should be valued more highly by virtue of their greater potential for 
generating future biodiversity. 

May (1990) and Vane-Wright et al. (1991) were among the first to artic- 
ulate the idea that a taxon's phylogenetic distinctiveness could be quanti- 
fied and used expressly in priority rankings for conservation. Their pro- 
posal was soon refined and elaborated by many workers (Barrowclough 
1992; Crozier 1992; Faith 1992, 1993; Williams et al. 1991; see reviews in 
Crozier 1997; Humphries et al. 1995; Krajewski 1994; May 1994). A key ele- 
ment in such phylogenetic ranking is the concept of "independent evolu- 
tionary history" (IEH), relative magnitudes of which are quantitatively 
assessed as branch lengths in phylograms. In any phylogram (estimated, 
for example, from molecular data), total IEH is the summed length of all 
tree branches. When several species within a group are to be rank-ordered 
for conservation value, the relevant branch lengths are appropriately dis- 
counted for branch segments shared with other extant taxa (May 1994). The 
typical argument is as follows: If conservationists could save only some 
fraction of living species in a given phylogram, the optimal choice would 
maximize the sum of independent branch lengths (each counted only once) 
to be preserved. In practice, this normally means that higher conservation 
priorities would be given to extant forms that lack close living relatives 
(such as tuataras), because such forms have had long independent evolu- 
tionary histories. 
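
The argument above can be made concrete with a short Python sketch. It
assumes a phylogram stored as a mapping from each node to its parent and
branch length, sums each branch only once across the species retained, and
searches all subsets of a given size by brute force. The toy tree, the
names TREE, TIPS, preserved_history, and best_subset, and the exhaustive
search are illustrative assumptions; they are not taken from May (1994) or
the other works cited.

```python
from itertools import combinations

# Hypothetical phylogram: node -> (parent, length of the branch leading to it).
# Species A lacks close living relatives; B and C are sister species.
TREE = {
    "A": ("root", 12.0),
    "X": ("root", 6.0),  # internal branch shared by B and C
    "B": ("X", 4.0),
    "C": ("X", 4.0),
}
TIPS = ["A", "B", "C"]


def preserved_history(tree, kept):
    """Summed length of all branches on root-to-tip paths of kept species."""
    branches = set()
    for tip in kept:
        node = tip
        while node in tree:  # climb until the root, which has no entry
            branches.add(node)  # a set ensures shared branches count once
            node = tree[node][0]
    return sum(tree[node][1] for node in branches)


def best_subset(tree, tips, k):
    """Return (species tuple, preserved history) maximizing history for size k."""
    scored = ((kept, preserved_history(tree, kept)) for kept in combinations(tips, k))
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(preserved_history(TREE, TIPS))  # total IEH of the whole tree
    print(best_subset(TREE, TIPS, 1))
    print(best_subset(TREE, TIPS, 2))
```

In this toy tree the single best choice is the species without close
relatives, and saving both sister species preserves less history than
saving one of them together with the distinctive species, because the
sisters share an internal branch. Exhaustive search is practical only for
small groups.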

Molecular data are well suited for estimating branch lengths as well as 
nodal placements in phylograms. Furthermore, to the extent that various
DNA sequences evolve in clocklike fashion (or that non-clocklike behavior
is accommodated in the phylogenetic analysis), these branch lengths may 
also be interpreted as estimates of evolutionary times since shared ancestry. 
Normally, species to be ranked for conservation value by IEH criteria 
would belong to a specific taxonomic assemblage, but in principle, the 
approach could also be used to rank-order species across disparate phylo- 
genetic arrays, such as particular mammals versus particular fishes or par- 
ticular arthropods. 

Figure 9.13 presents empirical phylograms for bears (Ursidae), cats 
(Felidae), marine turtles (Cheloniidae plus Dermochelyidae), and horse- 
shoe crabs (Limuloidea), all based on DNA sequence comparisons. 
Suppose that within each of these taxonomic assemblages, three or four 
candidate species are to be rank-ordered for preservation according to each 
of five ranking criteria mentioned above: rarity, restricted distribution,
ecological significance, charisma, and phylogenetic distinctiveness (using
the IEH metric). Assume further that for each taxonomic group, available
resources permit only one species to receive conservation attention. Which 
one should it be? 

Several points emerge from Figure 9.13. First, different criteria often 
rank the same species differently. Among marine turtles, for example, rari- 
ty and limited range would demand high conservation priority for Kemp's 
ridley, whereas the green turtle probably plays a greater ecological role in 
nature (it is an important marine herbivore), and the magnificent 
leatherback turtle (Dermochelys coriacea) probably has the most charisma 
and is certainly the most phylogenetically distinctive member of the group. 
Among felids, the tiger (Panthera tigris) probably ranks highest according to 
rarity and charisma, whereas the serval (Leptailurus serval) would get the 
preservation nod by virtue of having the narrowest range and being phy- 
logenetically most distinctive among the cat species considered. 

Second, different ranking criteria do not always associate in the same 
way. For example, phylogenetic distinctiveness and narrow range jointly 
support high conservation priority for the giant panda and the serval with- 
in their respective clades, but they conflict in the particular marine turtle 
species they earmark for conservation priority. Third, subjective or other- 
wise questionable judgments often come into play, as, for example, in rank- 
ing brown bears (Ursus arctos) versus polar bears (U. maritimus), or 
American horseshoe crabs (Limulus) versus Asian horseshoe crabs 
(Carcinoscorpius and Tachypleus), according to their ecological significance
or charisma. 

In principle, prioritization rankings for conservation could also apply 
across evolutionary groups. Suppose, for example, that again, a total of 
only four species from Figure 9.13 could be the subject of conservation 
efforts, but that the choice is no longer constrained to one species from each 
taxonomic array. Then, the more difficult or subjective of the ranking criteria
(ecological significance and charisma) provide little assistance as choices
are forced between preserving, for example, the polar bear, tiger, or
leatherback turtle. Even for the objective criteria that are directly comparable
across groups and quantifiable (i.e., rarity, restricted distribution, and
phylogenetic distinctiveness as measured by IEH), conservationists might
well decide not to abide by the final numerical rankings. For example, each
of the four living species of horseshoe crabs has a higher IEH score than do
any cats or bears, yet I doubt that many people would choose to direct
finite conservation resources toward these crabs if this came at the expense
of saving tigers or giant pandas.

Figure 9.13 How the conservation value of a species can vary according to five
conventional ranking criteria. Shown are time-dated phylogenies (estimated from
molecular genetic information in conjunction with fossil or other evidence, and
depicted on a common temporal scale) for surveyed extant species of bears
(O'Brien 1987; O'Brien et al. 1985a), cats (O'Brien et al. 1996), marine turtles
(Bowen et al. 1993a; Dutton et al. 1996), and horseshoe crabs (Avise et al. 1994;
Lynch 1993). Within each taxonomic group, the top-priority species according to
each ranking criterion (among those species listed in large type) is indicated by a
black circle. Gray circles indicate ties for top ranking. (After Avise 2004d.)


CONCLUSIONS ABOUT PHYLOGENETIC DISTINCTIVENESS AND CONSERVATION. 
Notwithstanding the considerable academic attention that has been paid to 
incorporating phylogenetic distinctiveness as a guide to preservation pri- 
orities at supraspecific levels, it appears that this criterion has had little 
practical effect on conservation efforts at these scales. As illustrated above, 
phylogenetic distinctiveness in practice often conflicts with various other 
quantifiable criteria (such as rarity and restricted distribution), as well as 
with subjective judgments about biological worth, that are not likely to be 
abandoned as primary bases for conservation prioritization. So, whereas 
phylogenetic uniqueness may bolster the rationale for particular conserva- 
tion choices when it agrees with other ranking criteria, it will seldom over- 
ride those other considerations when they are in conflict. This conclusion 
holds with even greater force with regard to rank-ordering species across 
disparate taxonomic groups. Thus, although the phylogenetic distinctive- 
ness of, say, bears and horseshoe crabs can be quantified and objectively 
compared, more subjective criteria (such as charismatic appeal) will 
undoubtedly be applied to such "apples and oranges" in most conservation 
decisions. 

There are additional reasons why phylogenetic considerations are 
unlikely to revolutionize on-the-ground conservation practices at the levels 
of genera and higher taxa. Based on a quantitative analysis that considered 
the full phylogenetic panorama of life, Nee and May (1997) concluded that 
about 80% of life's total independent evolutionary history (IEH) could be 
preserved even if about 95% of extant species were to go extinct. In more 
good news and bad news for conservation efforts, their analysis further 
showed that the fraction of total IEH preserved would not be improved 
much by intelligent phylogenetic choice, as opposed to random draws, of 
the species permitted to survive. 

Every species alive today traces back through an unbroken chain of 
ancestry over the eons, irrespective of how many speciation events have 
intervened along its phylogenetic journey. The mere fact that issues of con- 
servation priority and preservation triage must be raised represents a sad 
and shameful commentary on how humanity's environmental impacts have 
endangered Earth's natural genetic heritage. 
Conclusion


I want to close this chapter, and the book, by reiterating a sentiment expressed 
by E. O. Wilson in the quote that opened this chapter. Modern biology has 
indeed produced a genuinely new way of looking at the world. The molecular
perspectives emphasized in this text do not supplant traditional
approaches to the study of natural history and evolution, but rather enrich 
our understanding of life. Therein lies the greatest value of molecular meth- 
ods in conservation biology, or elsewhere. To the degree that we come to 
understand and appreciate other organisms, we will increasingly cherish 
Earth's biological heritage, and our own. As stated by the late Stephen J. 
Gould (1991), "We cannot win this battle to save species and environments 
without forging an emotional bond between ourselves and nature as well— 
for we will not fight to save what we do not love. ... We really must make 
room for nature in our hearts." 

Think back to even a few of the fascinating organisms whose natural his- 
tories and evolutionary patterns have been elucidated using molecular 
markers—honey mushrooms and their giant clones on the floors of northern 
forests; hybridogenetic live-bearing fishes in the arroyos of northwestern 
Mexico and the substantial evolutionary ages that some of these unisexual 
biotypes have achieved; the various sunfish species of the eastern United 
States and their unsuspected and sometimes devious means of achieving 
parentage; naked mole-rats in the deserts of Africa, with their eusocial 
behaviors and tight fabrics of kinship; and female green turtles who, after 
decades in the open ocean, swim thousands of kilometers to return faithful- 
ly to nest at their natal sites. If this book has accomplished nothing else, I 
hope that it may have engendered an increased awareness, respect, and love 
of the planet's marvelous genetic diversity. 


SUMMARY 


1. Many discussions of genetics in conservation biology have centered on the 
topic of heterozygosity or related measures of the within-population compo- 
nent of genetic variation. Molecular heterozygosity is indeed exceptionally low 
in many rare or endangered species, presumably because of genetic drift and 
inbreeding accompanying severe population reductions. 


2. Although it is tempting to manage endangered populations for enhancement 
of genetic variation, in most cases causal links between heterozygosity (as esti- 
mated by molecular markers) and fitness have proved difficult to establish. 
For these and other reasons, some authors have argued that behavioral and 
demographic issues should take priority over heterozygosity issues in management
programs for endangered species. In truth, both classes of concerns
are important. 


3. At the intraspecific level, molecular markers have found forensic application 
by allowing researchers to identify and track individuals and determine the 
gender of particular specimens. Such information, which is often not other- 
wise evident, can be critical in helping to monitor and manage threatened 
populations. 


4. Molecular approaches can also serve the field of conservation biology by
revealing genealogical relationships among populations of rare or endangered
species. Such phylogenetic assessments can range from genetic parentage
assignments in captive breeding programs, to the identification of management
units (MUs) and evolutionarily significant units (ESUs) of particular wild
species, to the characterization of major sources of regional phylogeographic
diversity around which management guidelines and natural reserves might be 
established. 


5. Particularly in the field of fisheries biology, molecular markers are widely 
employed to identify and characterize population stocks under exploitation. 
Contributions of genetic stocks to mixed fisheries can be quantified using 
molecular markers. Molecular approaches to stock assessment also have stim- 
ulated thought about the nature of information provided by inherited markers 
versus acquired markers (such as physical tags), and about the key distinction 
between evolutionarily deep versus shallow population genetic structures. 





6. Some conservation programs for endangered species have been directed 
toward taxa whose evolutionary distinctiveness has been questioned by 
molecular genetic reappraisals. In other cases, endangered "species" that were 
taxonomically suspect have proved upon molecular reexamination to be high- 
ly distinct genetically, thus adding to the rationale for special preservation 
efforts. 


7. By enabling species identification from even small or degraded bits of tissue
(such as fish eggs or turtle meat from illegally harvested endangered taxa), 
molecular markers provide powerful forensic tools for government agencies 
charged with enforcement of wildlife legislation. 


8. Hybridization and genetic introgression, often documented by molecular 
markers, have raised a variety of biological as well as regulatory issues regard- 
ing the status of many rare and endangered species. 


9. Molecular appraisals of species-level and higher-level phylograms can be used 
to quantify the magnitude of any taxon's independent evolutionary history 
(IEH) or phylogenetic distinctiveness, which often has been promoted as a cri- 
terion for conservation prioritization. However, several other important priori- 
tization criteria often conflict with IEH measures and thereby compromise the 
utility of phylogenetic distinctiveness (especially above the species level) as a 
pragmatic guide to conservation efforts. 
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Gopherus, 512-513 
Heteronotia, 394-395 
Lacerta, 9, 11 
Lachesis, 512-513 
Lepidochelys, 270, 519, 535 
Macroclemys, 528-529 
Malaclemys, 309 
Natator, 534, 535 
Phrynosoma, 375 
Sceloporus, 328, 387 
Sphenodon, 520, 536 
Sternotherus, 524-525 
Tympanocryptis, 524—525 
Uma, 9, 11 
Uta, 223 
Xerobates, 512-513 
Amphibians 
Ambystoma, 182, 396 
Bufo, 245, 340 
Gastrotheca, 357 
Hydromantes, 11 
Hyla, 11, 327, 367-369, 375, 379 
Litoria, 11, 512-513 
Plethodon, 11, 358, 392 
Rana, 11, 195, 375, 394, 396 
Scaphiopus, 11 
Taricha, 11 
Triturus, 379 
Xenopus, 446 
Fishes 
Acanthochromis, 260 
Acipenser, 280, 528-529 
Amia, 306-308 
Anguilla, 75, 260-261, 423 
Anthias, 233 
Astyanax, 37 
Auxis, 416 
Barbus, 295 


Bathygobius, 11 
Brevoortia, 280 
Campostoma, 11 
Catostomus, 392 
Centropristis, 280, 309 
Chaetodon, 310 
Chanos, 261 
Cichlasoma, 347 
Coregonus, 11, 349, 375 
Coryphaena, 416 
Cottus, 214, 347 
Cynoscion, 521 
Cyprinodon, 11, 366, 508 
Elacatinus, 264 
Embiotoca, 260 
Etheostoma, 11, 214 
Euthynnus, 416 
Fundulus, 40-41, 171, 303-304, 
309 
Galaxias, 339 
Gambusia, 248, 306-308, 388, 
481 
Gasterochisma, 416 
Gasterosteus, 350 
Gempylus, 416 
Gila, 375, 392 
Hesperoleucus, 333 
Hippocampus, 216 
Hoplostethus, 262 
Huso, 528—529 
Hybognathus, 519 
Hypentelium, 11 
Ilyodon, 347 
Istiophorus, 416 
Katsuwonus, 416 
Latimeria, 409, 512—513 
Lavinia, 333 
Lepidocybium, 416 
Lepomis, 11, 214, 215, 306-308, 
315, 331, 344, 364, 380 
Luxilus, 375, 392 
Makaira, 416 
Medialuna, 260 
Melanotaenia, 524-525 
Menidia, 11, 171, 394-395 
Micropterus, 214, 215, 306-308, 
364, 375, 521 
Neolamprologus, 392 
Nerophis, 216 
Notropis, 11, 344 
Oncorhynchus, 348, 364, 494, 
504, 527-528 
Opsanus, 309 
Oreochromis, 375 
Osmerus, 294 
Phoxinus, 394-395 
Plecoglossus, 503 
Poecilia, 393, 394-395 


Poeciliopsis, 183, 393-395, 
397-398, 480, 509 

Pomatoschistus, 214 

Priapella, 417 

Pylodictis, 521 

Rivulus, 171 

Ruminia, 256-257 

Ruvettus, 416 

Salmo, 222, 348, 502 

Salvelinus, 363, 375, 471, 527 

Sarda, 416 

Scaphirhynchus, 518, 524-525 

Sciaenops, 521 

Scomber, 416 

Scomberomorus, 416 

Sebastes, 342, 521 

Sphyraena, 416 

Spinachia, 214 

Stegastes, 261 

Syngnathus, 213, 216 

Tetrapturus, 416 

Theragra, 261 

Thoburnia, 11 

Thunnus, 357, 416 

Trichiurus, 416 

Xiphias, 416 

Xiphophorus, 416-417 

Ascidians 
Botryllus, 193, 266 
Diplosoma, 193 


ECHINODERMATA (starfish, sea 
urchins, and allies) 
Coscinasterias, 174 
Cucumaria, 358 
Echinometra, 263 
Echinothrix, 261 
Heliocidaris, 260 
Psolus, 358 
Strongylocentrotus, 261, 263 


ARTHROPODA 
Insects 
Acyrthosiphon, 347, 355 
Aedes, 359 
Agelaia, 238 
Anartia, 379 
Anopheles, 359—360 
Apis, 237, 238, 281 
Aquarius, 268 
Bacillus, 398 
Caledia, 375 
Camponotus, 238 
Cerceris, 238 
Chaitophorus, 355 
Chorthippus, 305 
Cicindela, 309, 342, 512-513, 
522-523 
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Colias, 40-41 

Collops, 269 

Drosophila, 6, 14, 16, 19, 37, 
40-41, 42, 84, 123, 124, 153, 
154-156, 278-279, 330, 332, 
333-338, 355, 356, 375, 
376-377, 379, 429, 458, 467 

Euhadenoecus, 269 

Eurosta, 347 

Euylaeus, 412 

Formica, 238 

Greya, 347 

Gryllus, 375, 383 

Hadenoecus, 269 

Halobates, 268 

Heliconius, 296-297 

Hemileuca, 522-523 

Lasioglossum, 239, 412-413 

Laupala, 335 

Locusta, 225 

Macrosiphum, 355 

Malacosoma, 241 

Melaphis, 355 

Microstigmus, 238 

Mimulus, 335 

Mindarus, 355 

Myrmica, 238 

Myzus, 171, 355 

Nasonia, 335 

Nicrophorus, 481 

Nothomyrmecia, 238 

Ophraella, 354 

Parachartergus, 237, 238 

Pemphigus, 355 

Plagiodera, 222 

Poecilimon, 225 

Polistes, 237, 355 

Polybia, 238 

Prodoxus, 347 

Quadraspidiotus, 356 

Rhagoletis, 346-347 

Rhopalosiphum, 355, 393 

Rhytidoponera, 237, 238 

Scaptomyza, 429 

Schizaphis, 355 

Schlectendalia, 355 

Sitobion, 171 

Solenopsis, 238 

Trachyphloeus, 356 

Uroleucon, 355 

Crustaceans 

Alphaeus, 263 

Alpheus, 356 

Artemia, 407 

Calanus, 262 

Cambarus, 522-523 

Carcinus, 356 

Chthamalus, 356 

Cyprinotus, 180 
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Daphnia, 178 
Darwinula, 180 
Diaptomis, 262 
Haptosquilla, 262 
Homarus, 263 
Jasus, 261 
Lithodes, 405 
Orconectes, 364 
Pagurus, 407 
Panulirus, 358 
Paralithodes, 405, 521 
Penaeus, 261, 356 
Synalpheus, 239-240 
Tigriopus, 40-41, 226, 263, 266, 
305, 337 
Other Arthropods 
Carcinoscorpius, 537 
Limulus, 262, 309, 537 
Proctolaelaps, 458 
Tachypleus, 537 


NEMATODA (roundworms) 


Ostertagia, 288 


MOLLUSCA (bivalves, snails, and 


allies) 
Achatinella, 522-523 
Amblema, 522-523 
Bathymodiolus, 261 
Biomphalaria, 200 
Bulinus, 200 
Busycon, 196 
Busycotypus, 196 
Campeloma, 393 
Cerion, 366 
Crassostrea, 39—40, 264, 309, 

522-523 
Deroceras, 201 
Geukensia, 309 
Goniobasis, 266 
Helix, 267 
Lasmigona, 512—513 
Liguus, 200, 479 
Lithasia, 522-523 
Littorina, 260 
Mercenaria, 387 
Mytilus, 40-41, 264, 375 
Nucella, 260 
Siphonaria, 263 
Thiara, 177 


ANNELIDA (segmented worms) 


Capitella, 356 
Octolasion, 176 
Phyllochaetopterus, 259 
Spirorbis, 259 


ECTOPROCTA (bryozoans) 


Cristatella, 170 


PLATYHELMINTHES (flatworms) 


Schistosoma, 354 


CNIDARIA (corals, jellyfish, and 


allies) 
Acropora, 170, 260 
Actinia, 170, 175 
Alcyonium, 175 
Anthopleura, 356 
Balanophyllia, 260 
Epiactis, 200 
Goniastrea, 200 
Hydractinia, 193 
Metridium, 175 
Montastrea, 356 
Montipora, 176 
Nematostella, 175 
Oulactis, 175 
Paracyathus, 260 
Plexaura, 176, 177 
Pocillopora, 175, 260 
Seriatopora, 170, 260 
Stylophora, 260 
Tubastraea, 170 


PORIFERA (sponges) 


Lucetta, 522-523 
Niphates, 176 


FUNGI 


Armillaria, 174 

Candida, 188, 189 
Coccidioides, 305 
Crumenulopsis, 170 
Fusarium, 460, 462 
Gibberella, 462 

Lentinula, 522-523 
Neurospora, 462 

Puccinia, 188, 354 
Saccharomyces, 188, 446, 462 


TRACHAEOPHYTA 


Angiosperms (flowering plants) 
Acacia, 522-523 
Aechmea, 172 
Aesculus, 381 
Agrostis, 248 
Anthoxanthum, 248 
Arabidopsis, 42, 328, 439, 441 
Arabis, 170 
Argyroxiphium, 375, 493 
Artemesia, 387 
Avena, 249 


Bensoniella, 482-483 


Beta, 441 
Betula, 172 
Brassica, 375, 439 
Calophyllum, 221 
Cerastium, 481 
Ceratophyllum, 441 
Cercocarpus, 527 
Chamaelirium, 220 
Clarkia, 218, 334 
Cucurbita, 221 
Datisca, 417-418 
Delphinium, 527 
Dubautia, 375 
Encilia, 390 
Epipactis, 522-523 
Erythronium, 334 
Eucalyptus, 375, 382-383, 
482483, 522-523 
Ficus, 193, 357 
Gilia, 218 
Glycine, 327, 441 
Gossypium, 375, 385, 386, 439 
Guara, 334 
Harperocallis, 482-483 
Helianthus, 16, 331, 336, 340, 
375, 382-385, 390 
Heuchera, 327, 375 
Hibiscus, 227 
Howellia, 482-483 
Ilex, 172 
Ipomoea, 227 
Iris, 172, 381-382, 387, 390, 441 
Lasthenia, 390 
Limnanthes, 509 
Lophocereus, 172 
Lupinus, 218, 266 
Lycopersicon, 334 
Lysimachia, 481 
Magnolia, 441 
Mimulus, 16 
Nicotiana, 73, 441, 442 
Nymphaea, 441 
Oenothera, 441 
Pavona, 175 
Pedicularis, 482-483 
Persea, 375 
Piper, 441 
Pisum, 6, 375, 442 
Pithecellobium, 221 
Platanus, 441 
Plectritis, 255 
Populus, 169, 375 
Quercus, 173, 221, 375 
Ranunculus, 441 
Raphanus, 219—220, 227 
Rubus, 170 
Rumex, 196 





Salix, 375 

Sasa, 172 

Saxifraga, 482-483 

Senecio, 327, 425 

Solidago, 172, 173 

Stephanomeria, 329, 388-389 

Swietenia, 221 

Symphonia, 221 

Tachigali, 221 

Taraxacum, 174 

Tellima, 375 

Tragopogon, 327 

Trifolium, 442, 482—483 

Tripsacum, 356 

Triticum, 248 

Trochodendron, 441 

Zea, 6, 356, 375, 441, 446 

Zelkova, 522-523 

Gymnosperms (cone-bearing 

seed plants) 

Abies, 441 

Agathis, 441 

Araucaria, 441 

Cephalotaxus, 441 

Chamaecyparis, 441 

Cryptomeria, 441 

Cycas, 441 

Ephedra, 441 

Ginkgo, 439, 441, 442 

Gnetum, 441 

Juniperus, 441 

Nageia, 441 

Phyllocladus, 441 

Pinus, 221, 244, 258-259, 375 

Podocarpus, 441 

Pseudotsuga, 441 

Sciadopitys, 441 

Taxus, 441 


Welwitschia, 441 
Zamia, 441 

Ferns and other non-seed 
tracheophytes 
Angiopteris, 441 
Asplenium, 327, 441 
Azolla, 441 
Isoetes, 441 
Lycopodium, 441 


BRYOPHYTA (mosses and allies) 
Plagiomnium, 327 


PROTISTS (single-celled 
organisms) 

Dictyostelium, 192-193, 446 
Entamoeba, 188 
Enteromorpha, 170, 173 
Giardia, 188 
Globigerina, 298 
Leishmania, 188 
Naegleria, 170, 188 
Pandorina, 298 
Plasmodium, 188 
Symbiodinium, 357 
Toxoplasma, 184, 188 
Trichomonas, 188 
Trypanosoma, 185, 188, 446 
Turborotalita, 298 
Volvulina, 298 


BACTERIA AND ARCHAEA 
Aeropyrum, 457 
Agrobacterium, 446, 456 
Anacystis, 446 
Aquifex, 457 
Archaeoglobus, 457 
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Bacillus, 191, 446, 457 
Bordetella, 190 
Borrelia, 457 
Bradyrhizobium, 125 
Buchnera, 353, 355 
Chloroflexus, 446 
Deinococcus, 457 
Escherichia, 6, 68, 94, 125, 188, 
190, 446, 456 
Haemophilus, 190, 457 
Halococcus, 446 
Haloferax, 446 
Helicobacter, 457 
Lactococcus, 359 
Legionella, 190 
Methanobacterium, 446, 457 
Methanococcus, 446, 457 
Methanospirillum, 446 
Mycobacterium, 457 
Mycoplasma, 457 
Neisseria, 190-191, 191 
Photobacterium, 125 
Pseudomonas, 446 
Pyrococcus, 457 
Rhizobium, 125 
Rickettsia, 457 
Sulfolobus, 298 
Sulfolobus, 446, 298 
Synechocystis, 457 
Thermococcus, 446 
Thermoproteus, 446 
Thermotoga, 446, 457 
Treponema, 457 
Wolbachia, 354, 383 





165 rRNA gene, 444 

18S rDNA, 445, 446—447 
18S rRNA gene, 439, 444 
50/500 rule, 489 

190-kDA antigen genes, 359 


Aat-1 gene, 278 
Acid phosphatase, 57 
Acipenseriforme fishes, taxonomy 
and conservation genetics, 
522-523 
Acquired characters, in wildlife 
management, 500-502 
Acrylamide gels, 68, 93 
" Adam" human patrilines, 300 
ADH, see Alcohol dehydrogenase 
AFLPs (amplified fragment-length 
polymorphisms) 
in clonal analyses, 174 
and qualitative markers, 105 
spatial distribution of clones, 172 
technique, 94-95 
Agamospermy, 173 
Agarose gels, 68 
AIDS, 316-318 
Albumin 
MCF procedures, 55-57 
molecular clock calibration, 124 
Alcohol dehydrogenase (ADH), 
14, 40-42, 154—155, 278 
Algae, spatial distribution of 
clones, 173 
Alkaline phosphatase, 57 
Allee effects, 487 
Allele frequencies, DNA finger- 
printing, 163-164 


Subject Index 


Allelic disequilibrium, 370 
Allopatric speciation, 349-350, 
351-353 
Allopolyploid species, 327 
Allozyme heterozygosity 
correlational approaches, 37-39 
and organismal fitness, 36-40 
in population bottlenecks, 
479—480 
and protein characteristics, 38 
Allozyme markers 
bacterial clones, 189-191 
in clonal analyses, 172—173, 175, 
176, 188-189, 193 
confirming clonal reproduction, 
170-171 
history of methods, 50 
in parentage analyses, 200-201, 
217, 220, 226, 227 
qualitative, 105 
Allozyme polymorphisms, inter- 
pretation, 61 
Allozymes 
divergence, 331-334 
genetic distance statistics, 107 
molecular clock calibration, 124 
study of genetic variation, 26-29 
Alternative reproductive tactics 
(ARTs), 222-223 
Altruism, 233, 235 
Amazonia, phylogeography of 
mammals, 290 
Amphibians 
genetic distance, 11, 12 
population genetic variation, 252 


sex-biased introgression, 379 
Amplified fragment-length poly- 
morphisms. See AFLPs 

amylase gene, 154, 156 
Analogy, vs. homology, 8 
Ancestor, mean time to common, 


Androdioecy, 417-418 
Androgenesis, in insects, 398 
Animals 
clonal reproduction, 170 
dispersal distance, 266-267 
DNA sequence variation, 43-44 
homeotic genes in, 15 
mtDNA and higher systematics, 
434-438 
parthenogenesis confirmed, 
170-171 
phylogenetics, 434-438 
Pleistocene vicariance, 420-421 
polyploid species, 396-397 
population genetic variation, 252 
speciation by hybridization, 392 
Antibiotic resistance, origin, 17 
Anurans (frogs), vicariance bio- 
geography, 419-420 
Apes, conservation genetics, 510 
ApoB gene, 17 
Apomixis, 170, 173-174 
Apomorphy, 116 
Archaea, phylogeny, 444-447 
Archezoa, origin, 448 
Armadillos, clonal reproduction, 
178 
Armed Forces forensics, 168 
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ARTs (alternative reproductive 
tactics), 222-223 
Ascidians (sea squirts), PCM of 
life habits, 414-416 
Asexual reproduction 
agamospermy, 173 
apomixis, 173-174 
clones confirmed, 169—171 
planula larva, 175 
See also Hermaphroditism; 
Parthenogenesis 
Asexual transmission, evolution- 
ary perspective, 21 
Aspartate aminotransferase, 278 
Associative overdominance, 39-40 
Atlantic coast, phylogeography, 
313-315 
atpB gene, 439 
Australia, convergence of fauna, 
427-428 
Autapomorphy, 116 
Autogamy 
mating systems and population 
structure, 249-257 
and speciation, 328-329 
See also Self-fertilization 
Autoradiographs, DNA elec- 
trophoresis, 68-69 


Background selection, 44, 45 
Bacteria 
clonal reproduction in, 189-192 
disease causing agents, 190-192 
horizontal gene transfer, 456~457 
molecular clock calibration, 125 
molecular taxonomy, 190 
PCM of magnetotaxis, 418 
phylogeny, 444-447 
Balance view, of genetic variation, 
24-25 
Balancing selection 
evidence for, 42 
in gene trees, 146-148 
and genetic polymorphism, 44, 
45 
and population structure, 264 
Bats 
kinship in colonies, 242 
PCM of flight, 403-405 
Bayesian analysis 
phylogenetics, 141-142 
population assignment, 281 
Bdelloid rotifers, age of clones, 
179-180 
Bears 
conservation priorities, 537-538 


individual tracking, 492 
pandas and, 410-412 
phylogeography, 290-292 
Binary characters, 110 
Biodiversity 
appreciation of, 540 
and future of systematics, 360 
and genetic diversity, 475-491 
regional reserves, 514-515 
Biogeography 
and molecular clocks, 128 
vicariance vs. dispersal, 418-426 
See also Geographic population 
Structure; Phylogeography 
Biological species concept (BSC), 
321-325, 361-363 
Biotypes, 180 
Birds 
brood parasitism, 199, 228-229 
convergence in Australian song- 
birds, 427-428 
dispersal vs. vicariance, 425 
DNA-DNA hybridization, 66-67 
DNA hybridization and system- 
atics, 433-435 
early evolution, 428-429 
gender-biased dispersal, 273-275 
Bender identification, 495 
genealogical concordance, 
309-310, 312 
genetic distance, 11, 12, 67 
kinship in social groups, 243-244 
parentage analyses, 197-201, 
206-216 
PCM of nesting habits, 414-415 
philopatry and population struc- 
ture, 272-273 
phylogeography of redwings, 
292-293 
population genetic variation, 252 
protein electrophoresis, 61 
sex typing, 194-195 
speciation times, 352 
vicariance biogeography, 419, 
425 
Bivalve mollusks, allozyme het- 
erozygosity and fitness, 38, 
40 


Black-footed ferret, phylogeny, 
532-534 

Black Sea basin, 295 

Black stilt, gender ídentification, 
495 

BLAST program, 20 

Bony fishes, allozyme heterozy- 
gosity and fitness, 38 


Bourgeois males, fishes, 214-215 
Branch Davidian fire, 168 
Branching process theory, and 
geneology, 284—285 
Breeding guidelines, rare and 
threatened species, 493 
Brood parasitism 
in birds, 199, 228-229 
phylogenetic character mapping, 
412 
Bryozoans, clonal reproduction, 
170 


BSC (biological species concept), 
321—325, 361-363 

Butterflies, Müllerian mimicry and 
phylogeography, 296—297 


CAIC software, 407 
Camin-Sokal parsimony, 141 
Candidate gene, speciation, 334, 
336-337 
Canines, introgression in, 529—531 
Captive populations 
heterozygosity, 478 
inbreeding, 488 
Carboxylesterase, 40-41 
Caribbean region, phylogeogra- 
phy, 423-424 
Carr-Coleman hypothesis, 289 
Catfish, wildlife forensics, 521 
CDH-W gene, 195 
cDNA (complementary DNA), 79 
Cellmark Company, 162 
CERVUS program, 198 
Cetaceans 
kinship within pods, 243 
migration and gene flow, 270 
phylogenetic character mapping, 
408—409 
Chagas disease, 185 
Chaotic patchiness, 262-263 
Character displacement, and eco- 
logical speciation, 350 
Character state discordance, vs. 
horizontal gene transfer, 455 
Character states 
in cladistics, 118, 119 
discrete, 105—110 
distance data, 105-110 
molecular information, 9 
phylogenetic use, 49 
polarity, 110, 119 
quantitative, 14, 16, 334-336, 403 
Charismatic megabiota, 536 
Cheetahs 
genetic variability, 480, 488 
population bottlenecks, 84 





Chimeras, 192-194 
Chimpanzees 
conservation genetics, 510 
genetic divergence from 
humans, 10-13 
individual identification, 
492-493 
phylogenetic relations, 431-432 
Chloroplast DNA. See cpDNA 
Chloroplasts, origin, 449 
Chromosomal rearrangements, 
374-378 
Cichlid fishes 
convergent evolution, 8-9 
evolutionary rates, 48 
speciation, 347, 349-350, 392 
CITES (Convention on 
International Trade in 
Endangered Species), 517 
Citrate synthase gene, 359 
Clade, definition, 117 
Cladistics 
definition, 117 
suitability of molecular data, 
119-120 
use of character states, 118, 119 
use of SINEs, 96-97 
vs. phenetics, 115-120 
Cladogenesis, lineage-through- 
time analyses, 342-343 
Classical school, view on genetic 
variation, 24 
Clinical applications, identification 
of fungal clones, 189 
Clinton-Lewinsky affair, 168 
Clonal reproduction 
confirmation, 169-171 
genets, 169 
and hermaphroditism, 171 
in microorganisms, 183-192 
polyembryony, 178 
population genetic criteria, 
184-185 
questions raised, 169-171 
ramets, 169 
Clones 
ages, 179-183 
bacteria, 189-191 
in fungi, 188-189 
in invertebrates, 169, 179-180 
phenotypic identification, 172, 
174, 175-176 
in plants, 169-170 
spatial distributions, 172-178 
vertebrates, 178, 180-183 


See also Asexual reproduction; 
Hermaphroditism; 
Parthenogenesis 

Clonet, 187 

Cluster analysis, phylogenetic 
trees, 134-136 

Clustered mutations, 212 

CMS (cytoplasmic male sterility), 
and introgression, 383 

Cnidaria, mtDNA phylogenetics, 
437 

Co-speciation, host-parasite phy- 
logenies, 353-355 

Coalescent theory 

conservation genetics, 489-490 

and geneology, 284-285 

human lineages, 299 

CODIS (Combined DNA Index 
System), DNA typing, 167 
Coelacanths, phylogenetic charac- 
ter mapping, 409-410 
Colonies, eusocial, 235-241 
Common ancestry 
mean time to, 34 
vs. convergence, 427—429 
Common yardstick rationale, 


Comparative Analyses by 
Independent Contrasts soft- 
ware, 407 

Concerted evolution, 17-18, 83-84 

Connectable data, 110—111 

Consensus trees, pandas and 
bears, 411 

Conservation, pollen sources in 
plants, 221 

Conservation biology 

lessons from phylogeography, 
510-515 
and molecular techniques, 
478-479, 488-491 
and phylogenetic analysis, 
532-539 
(Priorities, 535-539 
and taxonomy, 515-521, 522-525 

Conservation Biology, Society for, 
476 

Conservation genetics, 475 

chronology, 476-477 

and demography, 497-500 

effective population size, 
489-490 

fisheries stocks, 502-504 

gender identification, 495 

goals, 478 

historical population size, 496 
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identifying individuals, 491-492 
inherited vs. acquired markers, 
500-502 
introgression, 527-532 
parentage and kinship, 492-495 
population management cau- 
tions, 499-500 
population structure and phylo- 
geography, 497-515 
shallow vs. deep population 
structures, 505-510 
wildlife forensics, 521-526 
Continuous traits. See Quantitative 
traits 
Convention on International Trade 
in Endangered Species 
(CITES), 517 
Convergent evolution 
Australian fauna, 427-428 
flight in bats, 403-405 
homology and analogy, 8~9 
vs. common ancestry, 427-429 
Correlations (genetic) 
allozyme heterozygosity and fit- 
ness, 37-39 
in human forensics, 165 
Cougars 
genetic diversity, 481, 531 
individual tracking, 492 
Cowbirds, PCM of brood para- 
sitism, 412 
cox1 genes, 458 
Coyote 
individual identification, 492 
introgression, 529 
phylogeny, 530 
cpDNA (Chloroplast DNA) 
in clonal analyses, 174 
in phylogenetic analysis, 78-79, 
113 
and plant systematics, 438-443 
and reticulate evolution, 384—386 
tobacco, 73 
Crab phylogeny, rRNA analysis, 
407 


Creatine kinase, 62 
Crop plants, paternity analyses, 
221 


Crossopterygia, phylogenetic 
character mapping, 409-410 
Crustaceans, population genetic 
variation, 252 
Cryptic species 
molecular diagnoses, 356-361 
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zooxanthellae endosymbionts, 
357 
Cuckoldry, in fishes, 214, 215 
Cyanobacteria, origin of chloro- 
plasts, 449 
Cytochrome b, DNA sequences, 
101-103 
Cytochrome b gene, 412 
Cytochrome c, 337 
Cytochrome c oxidase, 337 
Cytonuclear disequilibria, 186 
Cytoplasmic capture 
and introgression, 372-376 
and reticulate evolution, 383-386 
Cytoplasmic disequilibrium, 378 
Cytoplasmic genomes, 21 
Cytoplasmic male sterility (CMS), 
383 


Darwin’s finches, 430 
Data management, 111 
Death Valley model, 508-509 
Deer, allozyme heterozygosity and 
fitness, 38, 39 
Degenerative disease, 21 
Demography 
and conservation genetics, 
497-500 
and geneology, 284-285 
historical, 279—280, 489-490, 
495—496 
Dengue fever, identifying vector 
species, 359 
Detached data, 110-111 
DGGE (denaturing gradient gel 
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evolutionary perspective, 21 
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phylogenetic resolution, 113 
technique, 63-67 
use in phylogenetic study, 52 
DNA electrophoresis, restriction 
analysis technique, 67-70 
DNA fingerprinting 
clonal population structures, 
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Egg-sperm interactions, tests of 
selection, 44 
Egg thievery, in fishes, 214 
Electron transport system (ETS), 
337 
Electrophoresis. See DNA elec- 
trophoresis; Protein elec- 
trophoresis 
Electrophoretic types (ETs), bacte- 
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Fossils, molecular clock data, 128 
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674 Subject Index 


Freshwater animals, and popula-
tion structure, 262
419-420
Gametic dispersal, 257-265
Garden of Eden scenario, 299
496-497, 499
and introgression, 376-378
combined species concept,
Genet, definition, 169
evolutionary perspective, 21
concept of, 106
sis, 67
statistics for, 107-109
estimating, 9-14
and speciation rates, 342-346
Genetic diversity
population consequences,
heterozygosity and fitness, 39
sizes, 7
views of structure, 24-25 
Genomic transfers, 450-452
249-257
279-280
gender-biased dispersal, 273-277
philopatry, 269-273
Hermaphroditism
and clonal reproduction, 171
in fishes, 171
HGT. See Horizontal gene transfer
Histochemical stains, 59
HIV viruses, 316-319
HKA test, 43
in mtDNA, 72, 78
sexual bias in, 367-370
450-451
Inclusive fitness, 232-233
identifying, 490-492
genealogical concordance,
309-310
heterozygosity and selection, 37
Linkage maps, sunflowers, 391
Mammals
human “Eve” theory, 299-300
Meiotic drive, 46
Mendelian markers
microsatellites, 92-93 
from protein electrophoresis, 
50-61
Meningitis, 190 
Metazoan phylogeny, 18S rDNA
sequences, 444-445 
Methicillin resistance, in
Microbes
Mixed-mating model, in plants,
217 





Molecular Evolutionary Genetic
Analysis (MEGA) software, 
109
fossil proteins, 466
humans, 469-470 
Molecular tags, 4 
Molecular variability. See Genetic 
divergence; Genetic diversity; 
Genetic variation
Mollusks
gamete recognition, 337 
population genetic variation, 252 
Monogamy 
in fishes, 216 
and multiple concurrent paterni- 
ty, 222 
and sexual selection, 203 
Monophyletic group, 117 
Monophyly, reciprocal, 145-146, 
148 


Morphological stasis, cryptic 
species, 357-358 

Morphological systematics, 
115-116 


Mosaic genomes, 451-452 
Mosquitoes, cryptic species, 
359-360 
Mountain lion. See Cougars 
MPAs (Marine Protected Areas), 
496 
MrBayes program, 141 
mtDNA (Mitochondrial DNA) 
in clonal analyses, 174, 181-183
gender-biased dispersal, 274-277
and genealogy, 285-289
philopatry and population struc- 
ture, 269-273 
in phylogenetic analysis, 18-19, 
435-438
phylogeographic case studies, 
289-301 
polymorphism and effective 
population size, 35 
in population biology, 476 
in RFLP analyses, 50 
mtDNA (animal) 
amplification of genome, 89 
and animal systematics, 434-438
characterized, 72-73
digestion profiles, 74-77 
evolutionary rate, 72, 123-124 
human, 73, 299-300 
interpretation, 76-78 
interpretive errors, 450 
maternal transmission, 74 
phylogenetic analyses, 72-74, 
434-438 
RFLP assay techniques, 70-71, 73 
mtDNA (plant), 78 
MTL gene, 189
Muller's ratchet, 179
Multi-locus data, 111
Multi-locus organization, of 
genomes, 256 
Multi-state characters, 110 
MUs (Management units), 505-507
Mutation rates, 120-123 
and neutrality theory, 36 
See also Molecular clocks 
Mutations 
clustered, 212 
See also Neutrality theory 


Naked mole-rats, eusocial 
colonies, 241 

National Research Council, DNA 
fingerprinting reports, 166 

Natural history, molecular studies, 
49-53 

Natural selection, 23 


balancing selection, 42 

classical vs. balance schools, 
24-25 

on DNA sequence variation, 
41-44 

evidence of non-neutrality, 
278-279 

at molecular level, 46-47 

and neutrality theory, 30-31, 
44-47 

and population structure, 264 

on protein polymorphisms, 
40-41 


selective sweep, 46 
statistical evidence from DNA, 
42-44 
Ne. See Effective population size
Neanderthals, fossil DNA, 
469-470 
Neighbor-joining method. See N-J 
method 
Nei’s standard genetic distance, 
107 
Neoclassical theory. See Neutrality 
theory 
Nepotism hypothesis, in armadil- 
los, 178 
Nest takeovers, in fishes, 224 
Neutrality, departures from, 
277-279 
Neutrality theory 
difficulties of, 44-47 
DNA tests, 42-44 
effect of uncertainty, 47 
and molecular clocks, 121 
perspective on evolution, 36 
and population genetics, 31-35 
predictions of evolutionary rates, 
36 
and selection, 30-31, 44-47 
N-J (neighbor-joining) method
individual population assign- 
ment, 281, 282 
phylogenetic trees, 135, 136-137 
Non-eusocial groups, kinship in, 
241-244 
Non-universal code, evolutionary 
perspective, 21 
notch gene, 154 
Nucleotide diversity 
extent of polymorphism, 30 
measure of heterozygosity, 27 
Nucleotide sequences. See DNA 
sequences 
Numerical taxonomy, 116-117 
Nup96 gene, 337 





O. J. Simpson trial, 168 

OdsH (Odysseus) gene, 336 

Oligonucleotides 

PCR techniques, 89 
restriction analysis techniques, 
67-68 

Orangutans, 510 

Organismal fitness. See Fitness 

Orthology, 18 

Oryx, parentage analyses, 493 

Ostracods, age of clones, 180 

OTUs (Operational taxonomic 
units), 132-133 

Out of Africa hypothesis, 300-301 

Outcrossing, in plants, 217-218 

Outgroup, 117 

Ovalbumin, MCF assay, 57 

Overdominance, heterozygosity 
and fitness correlations, 
39-40 

Overdominance scenario, 485 


P elements, 458 
Paleontology, molecular, 466-471 
Pandas, phylogenetic character 
mapping, 410-412 
Parallel evolution, of mtDNA 
genomes, 437 
Paralogy, 18 
Paraphyletic group, 117 
Paraphyly 
lineage sorting, 145-146, 148 
and speciation, 339-341 
Parasites 
co-speciation, 353-355 
host-switching and speciation, 
346-347 
Parentage, modes of, 197-201 
Parentage analyses 
alternative reproductive tactics 
(ARTs), 222-223 
in birds, 206-216 
concurrent multiple paternity, 
221-222 
in conservation genetics, 492-495 
DNA fingerprinting, 85 
in fishes, 212-216 
in humans, 204 
maternity analyses, 227-229 
mating behavior, 202-204, 212 
in plants, 216-221 
pollen competition, 224-227 
pollen dispersal, 266 
population size estimates, 
229-230 


in primates, 204-206 
sexual selection, 202-204 
sperm competition, 224-227 
sperm storage, 223-224 
statistical techniques, 197-198 
types of parentage, 196-202 
Parental investment (PI) 
in fishes, 214 
strategies in birds, 209-210 
PARENTE program, 198 
Parsimony 
algorithms, 120 
phylogenetic trees, 139-141 
Parthenogenesis 
confirmation of, 170-171 
invertebrates, 176-178, 179, 180 
parentage analyses, 201 
vertebrates, 181 
Parthenogenetic speciation, 
393-395 
Paternal care, in fishes, 212, 
214-216 
Paternity analyses 
concurrent multiple paternity, 
221-222 
in plants, 219-221 
in primates, 204-205 
See also Parentage analyses 
Pathogenesis, 191-192 
PATRI program, 198 
Patrilineal genealogy, 285
Patristic similarity, 116
PAUP* (Phylogenetic Analysis
Using Parsimony) program,
140
PCM (Phylogenetic character 
mapping) 
androdioecy in plants, 417-418 
Ascidian life habits, 414-416 
behavior, 412-418 
cetaceans, 408-409 
challenges to, 403 
definition, 48, 402
diversity of, 420-421 
endothermy in fishes, 414-416 
independent contrasts, 406-407
magnetotaxis in bacteria, 418
and morphology, 403-412
nesting habits in birds, 414-415
phylogeography of lizards, 
295-296 
in sweat bees, 412-413 
swordtail fishes, 416-417
PCR (Polymerase chain reaction) 
advantages and disadvantages, 
89-91
AFLPs, 94-95
in DNA fingerprinting, 164, 476 
DNA sequencing, 98-103 
HAPSTRs, 98 
history of, 50-52 
RAPDs, 91-92 
SINEs, 96-97 
SNPs, 97-98 
SNPSTRs, 98 
sources of DNA recovered, 90-91 
SSCPs, 97 
in STRs (microsatellites), 92-95 
technique, 87-89 
Pennsylvania v. Pestinikas, DNA 
evidence, 167 
Periodic selection, 187, 190 
Pgi gene, 165 
Pgm gene, 265 
Phenetic similarity, 116 
Phenetics, vs. cladistics, 115-120 
Phenograms, 135, 231 
Phenotypic evolution, rates of, 
48-49 
Phenotypic traits 
clonal identification, 172-176 
molecular characterization, 
14-17 
Phenylketonuria, 17 
Philopatry, 269-273 
Phosphoglucomutase (PGM) pro- 
tein, 60 
6-Phosphogluconate dehydroge- 
nase, 40-41, 265 
Phyletic gradualism, 342-345 
and speciation, 330 
PHYLIP (phylogeny inference 
package) program, 140 
Phylogenetic analysis, 4-5 
applicability of methods, 112-113 
biogeographic assessment, 
418-431 
cladistics vs. phenetics, 115-120 
common ancestry vs. conver- 
gence, 427-429 
conservation priorities, 535-539 
cpDNA, 438-443 
history of molecular methods, 
49-53 
homology vs. analogy, 8-9 
interpreting discrete characters,
10
king crabs and hermit crabs, 
405-407 
mtDNA in animals, 434-438 
phylogenetic character mapping, 
402-403
protein markers, 61-63
rationale for, 402 
rDNA, 443-444, 445-447
speciation signatures, 341-343
use of SINEs, 96-97 
vicariance biogeography, 
418-426 
Phylogenetic character mapping. 
See PCM 
Phylogenetic species concept 
(PSC), and the BSC, 361-363
Phylogenetic trees, 132-143 
Bayesian analysis, 141-142 
character-based, 139-142 
distance-based, 134-139 
evaluation of, 142-143 
Maximum likelihood (ML) 
methods, 141 
maximum parsimony, 139-141 
Neighbor-joining method, 135, 
136-137 
temporal information, 463-464
UPGMA/Cluster analysis,
134-136 
Phylogeny, 4-5 
of black-footed ferrets, 532-534 
early life, 451-452
of elephants, 510-511
of Eucarya, 444-447
host-parasite concordance, 
353-355 
of marine turtles, 534-535 
microtemporal, 316-319 
wolves and coyotes, 530 
Phylogeography, 283 
branching processes and coales-
cence, 284-285
case studies, 289-301 
common ancestry vs. conver-
gence, 427-429
comparative studies, 310-313 
conservation principles, 510-515 
early evolution of birds, 428-429 
early evolution of mammals, 429 
and fisheries management, 
502-504 
genealogical concordance, 
301-314 
genealogical discordance, 
314-316 
history of, 286-288 
of humans, 298-301 
and plate tectonics, 425-426 
and population structures, 
497-499, 505-510 


rare and threatened species, 
512-513 
Phylograms, 463, 465 
PI. See Parental investment
Plain pigeons, genetic variation 
and fitness, 487 
Plankton 
dispersal and population struc- 
ture, 259-265 
identification, 358-359 
Plants 
allozyme-based estimates of het- 
erozygosity, 28 
clones, 169-170, 172-174 
conservation genetics, 509 
cpDNA and phylogenetics, 
438-445 


dispersal distances, 266, 425 
DNA sequence variation, 43-44
gender-biased dispersal, 277 
gene flow and introgression, 
381-383 
genetic chimeras, 192, 193 
genetic distance and speciation, 
334 
genetic population structure, 250 
genetic swamping, 527 
geographic population structure, 
248-257 
hermaphroditism, 216-219 
and horizontal gene transfer, 459 
intracellular gene transfers, 451 
introgression and asymmetric 
gene flow, 381-383 
kinship questions, 244 
mtDNA limitations, 78 
origin of land plants, 443 
paternity assignment, 219-220 
PCM of androdioecy, 417-418 
Pleistocene vicariance, 20 
pollen competition, 227 
sex determination, 196 
spatial distribution of clones, 
172-174 
speciation by hybridization, 
388-391 
Planula larva, asexual reproduc- 
tion, 175 
Plastids, origin, 447-450 
Plate tectonics, and phylogeogra- 
phy, 425 
Pleistocene age, vicariant biogeog- 
raphy, 420-421 
Plesiomorphy, 116 
PNP (Purine-nucleoside phospho-
rylase), 60


Pocket gophers 
conservation genetics, 518 
mtDNA phylogeny, 287 
pol gene, 19, 319 
Polarized characters, 110, 119 
Pollen 
dispersal, 257-259, 266, 277 
paternity analyses in plants, 
220-221 
and population structure, 
257-259 
Pollen competition, 227 
Polyandry 
in fishes, 216 
and sexual selection, 203 
Polyembryony, in mammals, 178 
Polygenic traits, and phylogenetic 
character mapping, 403 
Polygynandry 
in fishes, 216 
and sexual selection, 203 
Polygyny 
and multiple concurrent paterni- 
ty, 222 
and sexual selection, 203 
Polymorphism 
vs. speciation, 347 
See also DNA polymorphism, 
mtDNA, Protein polymor-
phisms
Polypeptides, molecular clock cali-
bration, 124
Polyphyletic group, 117 
Polyphyly, 145-146, 148 
Polyploidy, and speciation, 
327-328, 396-398 
Pontiac fever, 190 
Population assignments, of indi-
viduals, 180-181
Population bottlenecks 
and effective population size, 32 
and genetic diversity, 478-484
and genetic drift, 37 
and MHC genes, 84 
See also Founder events 
Population genetics 
criteria of clonal reproduction,
184-185
and neutrality theory, 31-35 
and protein electrophoresis, 
27-29 
Population hierarchy, evolutionary 
perspective, 21 
Population size, 32-33 
estimates and parentage analy- 
ses, 229-230 


See also Effective population size; 
Inbreeding depression; 
Population bottlenecks 

Population structure 

and DNA fingerprinting, 
165-166, 176-178 

statistics of, 251 

sweepstakes reproduction, 
262-263 

See also Geographic population 
structure, Phylogeography 

Population viability analyses 
(PVA), 476 

Porphyria, variegate, 17 

Postzygotic barriers, reproductive 
isolation, 324 

Prairie chicken, population bottle- 
neck and fitness, 487 

Prairie dogs, kinship in colonies, 
242 

Prezygotic barriers, reproductive 
isolation, 324 

Primary hybrid origin hypothesis, 
396-398 

Primates 

chimeras in, 194

parentage analyses, 204-205 

See also Humans 

Primers 

in microsatellite assays, 93 

in PCR techniques, 89, 93, 96 

in SINEs, 96 

Probes, 79 
PROBMAX program, 198 
Prokaryotes 
horizontal gene transfer, 456-458 
phylogenetic lineages, 446-447
Protein assays, vs. DNA-level fea- 
tures, 104-105 
Protein electrophoresis 

comparability of data, 110-111 

conservation genetics, 476 

estimating genetic distance, 
10-12 

evaluation of variants, 29-30 

history of, 50 

Mendelian markers, 59-61 

phylogenetic interpretations, 
61-63 

phylogenetic resolution, 112 

sources of bias, 30 

studies of genetic variation, 
26-29 

technique, 57-59 

Protein immunology 

phylogenetic resolution, 113 


in phylogenetic study, 52 
technique, 55-57 
Protein polymorphisms 
allozyme surveys, 26-29 
from electrophoresis, 61 
vertical approaches, 40-41
Proteobacteria, origin of mito- 
chondria, 449-450 
Protists 
clonal reproduction, 170 
DNA sequence variation, 43-44 
phylogenetic diversity, 448-449
Protozoans 
clonal agents of disease, 188 
clonal reproduction in, 183-188 
recombination in, 187 
Puma. See Cougars 
Punctuated equilibrium, 329, 
342-345 
Pupfishes, conservation genetics, 
508 


Purple proteobacteria, origin of
mitochondria, 449-450

PVA (Population viability analy- 
ses), in conservation genetics, 
476 


QTLs (Quantitative trait loci) 

mapping, 14, 16 

speciation analysis, 334-336
Quagga, fossil DNA, 467-468
Qualitative character states. See
Discrete characters

Quantitative traits

and phylogenetic character map-
ping, 403
QTLs, 14, 16
speciation analysis, 334-336


Radioactive labeling, technique, 
68-69 


Ramet, definition, 169 
RAPDs (Randomly amplified 
polymorphic DNAs), 91-92 
fungal clones, 189 
and qualitative markers, 105 
spatial distribution of clones, 
172, 174, 175 

Rape, in primates, 206 

Rare and threatened species 
breeding guidelines, 493 
gender identification, 495 
genetic diversity in, 479-484 
genetic swamping, 527-532 
management programs, 491 
phylogeography, 512-513
population structure of plants,
509 
Rates of evolution. See
Evolutionary rates; Molecular 
clocks 
Rats, molecular clock, 130 
rbcL gene, 73, 418, 439 
rDNA 
in clonal analyses, 174 
diversity in zooxanthellae, 357 
phylogenetic analyses, 443-444, 
445-447 
RFLP analysis, 83-84 
Reciprocal monophyly, 145-146, 
148 


Recombination 
classical vs. balance views, 24, 25 
genetic phase disequilibrium, 
186-187 
indicators of clonal reproduc- 
tion, 184-185 
restrictions on, 256 
Recombinational speciation, 
388-389 
Red deer, heterozygosity and fit- 
ness, 39 
Red-winged blackbirds, phylo-
geography, 292-293
Regional reserves, 514-515 
Relatedness. See Genetic related- 
ness, Kinship 
Relatedness program, 232 
Reproductive isolating barriers 
(RIBs), 321-325 
and combined species concept, 
362 
reproductive isolating genes, 338 
Reproductive technology, in con- 
servation genetics, 477 
Reptiles 
genetic distance, 11, 12 
kinship in social groups, 243 
population genetic variation, 252 
Restoration genetics, 496 
Restriction analyses 
statistics for, 107-108 
technique, 67-70 
See also RFLPs 
Restriction site matrix, mtDNA, 77 
Reticulate evolution, 4 
and cpDNA phylogeny, 442 
and cytoplasmic capture, 
383-386 
Retropseudogenes, SINEs, 96
Retrotransposable elements. See 
RTEs 
Retroviruses (RVs), 460
Reverse transcriptase (RT),
460-461 
RFLPs (restriction fragment-length 
polymorphisms) 
and animal mtDNA, 70-78 
bacteria, 191 
fungal clones, 189 
genetic variability in cheetahs, 
480 
history of study, 50 
and plant organelles, 78-79
and qualitative markers, 105
repetitive gene families, 83-84 
and scnDNA, 79-82 
technique, 68-70 
See also Minisatellites 
Rhinoceros, conservation genetics, 
484, 507-508 
RIBs. See Reproductive isolating 
barriers 
Ring species, 331-333 
RNA, in clonal analyses, 188 
RNA genes. See rDNA 
RNA viruses, evolutionary rates, 
316-317 
Rockfishes, wildlife forensics, 521 
Rogers's distance, 107 
Root of life, 447 
Rotifers, age of clones, 179-180 
Rpm1 gene, 42 
rRNA (ribosomal RNA) 
identifying picoplankton, 
358-359 
molecular clock calibration, 124 
rRNA genes
18S subunits, 439, 444
concerted evolution, 17-18 
domains of life, 444 
RT (Reverse transcriptase), 
1 


RTEs (Retrotransposable ele- 
ments), 19-20 
and horizontal gene transfer, 
458-461 
See also SINEs 
Russian Czar, 168 
RVs (Retroviruses), 460 


Sabre-toothed cat, fossil DNA, 470 
Salmonid fishes 
allozyme heterozygosity and fit- 
ness, 38 
homing and population struc- 
ture, 270-271 
phylogeography, 502-504 
sympatric speciation, 347-348 


Saltational speciation, 325 
scnDNA (single-copy nuclear 
DNA) 
in clonal analyses, 184 
molecular clock calibration, 124 
in parentage analyses, 199 
and PCR technique, 80-82 
and RFLP techniques, 79-82 
Sea stars, clonal reproduction, 174 
Sea turtles. See Marine turtles 
Seaside sparrows, conservation 
genetics, 515-518 
Seed banks, 497 
Seeds, dispersal and population 
structure, 257-258 
Selective sweeps, 44, 45, 46, 187 
Self-compatibility, and speciation, 
328-329 
Self-fertilization 
avoidance in plants, 44 
in plants, 217-218 
restrictions on, 256 
See also Autogamy 
Self-incompatibility 
in plants, 216-217 
and speciation, 328-329 
Selfish genes 
evolutionary perspective, 21 
and selection, 46 
Semi-species, 331 
Septicemia, 190 
Sex determination, modes of, 196 
Sex typing, 194-195 
Sexual bias 
in hybrid zones, 367-370 
and introgression, 378-379 
Sexual selection 
in birds, 206-212, 351 
in fishes, 216, 350 
parentage analyses, 202-204 
and speciation, 350-351 
Seychelles warbler, gender identi- 
fication, 495 
Shrimp, eusocial colonies, 239-240 
Sibling species, 331 
See also Cryptic species 
Silversword, parentage analyses, 
493-494
Silvery minnow, conservation 
genetics, 519 
SINEs (Short interspersed ele- 
ments) 
phylogenetic resolution, 113 
technique and applicability, 
96-97 
Single-locus data, 111
Sister taxa, 117 


SIVs, and HIV, 316-319 
Slime molds, genetic chimeras, 
192-193 
Snails, sperm competition in, 
226-227 
SNPs (Single nucleotide polymor- 
phisms), technique, 97-98 
SNPSTRs (Short autosomal 
regions using STRs), 98 
Social parasitism, 355 
Sociality. See Eusocial colonies; 
Non-eusocial groups 
Society for Conservation Biology, 
476 
Sonoran topminnow, genetic vari- 
ability in, 480, 487 
Southeastern United States, phylo- 
geography and genealogical 
concordance, 307-310, 
312-314 
Southern blotting, 69-70 
Speciation 
allopatric, 349-350, 351-353 
allozyme evidence, 331-334 
candidate gene, 334, 336-337 
co-speciation, 353-355 
and conservation biology, 
515-526 
definition, 321 
ecological, 350-351 
founder events, 329, 338-341 
gamete recognition, 337-338 
by hybridization, 388-398 
latitudinal gradients, 345 
mating systems, 328-329 
Mendelian approaches, 325-331 
microevolution, 330 
modes of, 326 
molecular clocks, 345-346 
and paraphyly, 339-341 
parthenogenesis, 393-395 
phylogenetic signatures, 341-343 
polyploidy, 327-328, 396-398 
punctuated equilibrium, 342-345 
rates and genetic divergence, 
342-346 
recombinational, 388-389 
sudden vs. gradual, 325-330 
sympatric, 346-351
time for, 327-329, 351-353, 365, 
429-431 
unisexual biotypes, 392 
Speciation genes, 334-338 
Species concepts 
BSC and phylogenetic, 361-363 
historical, 321-325 





Species flocks, in fishes, 347-350 

Sperm displacement, 225 

Sperm sharing, 226-227 

Sperm storage, 223-224 

Spontaneous origin hypothesis, 
396-397 

Squirrels, kinship in colonies, 242 

SSCPs (single-strand conforma- 
tional polymorphisms), tech- 
nique, 97 

SSLPs (simple-sequence length 
polymorphisms). See 
Microsatellites 

Starch-gel electrophoresis (SGE), 
technique, 58-59 

State v. Andrews, Orange County, 
Florida, DNA evidence, 167 

Stepping stone model, of gene 
flow, 252 

Stream hierarchy model, 509 

STRs (Short tandem repeats). See 
Microsatellites 

Sturgeon, conservation genetics, 
519 


Sudden speciation, 327-329 
Sunflowers, reticulate evolution 
in, 383-385 
Supergenes, 187 
Superoxide dismutase, 42 
Supertrees, 462-464 
Sweat bees, PCM of sociality, 
412-413 
Sweepstakes dispersal, 422, 425 
Sweepstakes reproduction, 
262-263 
SwissAir Flight 111, 168 
Symbiosis 
fig-wasp, 357 
See also Endosymbiosis 
Sympatric speciation, 346-351 
Symplesiomorphy, 116 
Synapomorphy, 116, 117 
Syntopic populations, 323 
Systematics 
cpDNA and, 438-443 
DNA-DNA hybridization and, 
433-435 
future of biodiversity assess- 
ment, 360 
morphological, 115-116 
mtDNA and, 434-438
universal standards, 10-14,
464-467


Tm, definition, 64

Tajima's D, 43 

"The Tapestry", 433 

Taq polymerase, PCR technique, 87


Tasmanian wolf, fossil proteins, 
466 
Taxonomic disparity, 467 
Taxonomic traits 
plasticity, 6 
See also Phenotypic traits 
Taxonomic uncertainties, and 
genetic divergence, 333 
Taxonomy, and conservation biol- 
ogy, 515-521, 522-525 
Temporal duration, speciation, 
351-353, 365 
Temporal information, phyloge- 
netic trees, 463-465 
Temporal scales, and phylogeogra- 
phy, 423-425 
Tent caterpillars, kinship in 
colonies, 241-242 
Termites, eusocial colonies, 240 
TEs (Transposable elements), 
19-20 
and horizontal gene transfer, 459 
Tetrapods, phylogenetic character 
mapping, 409-410 
Thermal elution, 64-67 
Thresholds, in phylogenetic char- 
acter mapping, 404 
Transferrin, MCF assay, 57 
Transilience model, of speciation, 
329 
Transmission genetics, of eusocial 
colonies, 236 
Transplantations, of species, 514 
Transposable elements. See TEs 
Tree frogs 
polyploid species, 327 
sexual bias in hybridization, 
367-370 
Tree of Life 
future of, 461-466 
phylogeny of domains, 446 
TREE-PUZZLE program, 141 
Tree snails, genetic variability in, 
479-480 
Trees (phylogenetic). See 
Consensus trees; Gene trees; 
Phylogenetic trees 
Trees (plants) 
gene flow and natural history, 
258 
phylogeography in Europe, 297 
tRNA (transfer RNA)-derived 
retroposons, SINEs, 96 
Trophic morphs, fish species, 
347-348 


Trout 
conservation genetics, 527-529 
parentage analyses, 494
stock assessment, 502-503
Tuataras 

conservation genetics, 520 

conservation priority, 536 
Turtles 

DNA sequences, 101-103 

molecular clocks, 128-129 

See also Marine turtles 


Ubiquitous dispersal hypothesis, 
free-living microbes, 297-298 

Unisexual vertebrates, clonal ages, 
180-183 

Universal yardsticks, systematics, 
10-14, 464-467

Unknown Soldier, DNA finger- 
printing, 168 

UPGMA analysis, phylogenetic 
trees, 134-136 


Vagility, and dispersal distances, 
267-277 
Variegate porphyria, 17 
Vegetative layering, 172-173 
Vertebrates 
age of clones, 180-183 
allozyme-based estimates of het- 
erozygosity, 28 
diagnosing cryptic species, 
357-358 
genealogical concordance, 
310-313 
genetic chimeras, 194 
genetic distance, 11, 12 
population genetic variation, 252 
sex typing, 194-195 
spatial distribution of clones, 178 
speciation by hybridization, 
392-398 
speciation times, 352 
Vicariance biogeography 
Caribbean scenarios, 423-424 
phylogenetic analysis, 418-426 
vs. dispersal, 422, 424 
Virulence, origins, 191-192 
VNTRs (variable number of tan- 
dem repeats). See 
Minisatellites 
Vultures 
convergent evolution, 8-9 
polyphyly, 433 


W-specific markers, 194-195, 273 
Wagner parsimony, 140 
Whales 

historical population size, 496
individual tracking, 492
kinship in pods, 243 
migration and gene flow, 270 
phylogenetic character mapping, 
408-409
wildlife forensics, 524-525, 526
white gene, 154 
Whooping cough, 190 
Whooping cranes, parentage 
analyses, 493 
Wild-type alleles, 24, 25 
Wildlife forensics 
conservation genetics, 521-526 
population assignment, 281 


USFWS laboratory, 465-477 
Wildlife management, identifying 
individuals, 490-492 
Wolves 
inbreeding in, 479, 487 
introgression, 530-531 
phylogeny, 530 
Wombats, parentage analyses, 
494-495 


X chromosome markers, human 
origins, 300 
Xanthine dehydrogenase, 154
Y chromosome markers
human patrilines, 300 
mammals, 195, 273 

Yellow fever, vector species, 359